Abstract

In this paper we derive results concerning the connected components and the diameter of
random graphs with an arbitrary i.i.d. degree sequence. We study these properties primarily,
but not exclusively, when the tail of the degree distribution is regularly varying with exponent
$1-\tau$. There are three distinct cases: (i) $\tau > 3$, where the degrees have finite variance, (ii)
$\tau \in (2,3)$, where the degrees have infinite variance, but finite mean, and (iii) $\tau \in (1,2)$, where
the degrees have infinite mean. These random graphs can serve as models for complex networks
where degree power laws are observed.

Our results are twofold. First, we give a criterion for the existence of a unique largest connected
component of size proportional to the size of the graph, and study the sizes of the other connected
components. Secondly, we establish a phase transition for the diameter when $\tau \in (2,3)$. Indeed,
we show that for $\tau > 2$ and when nodes with degree 2 are present with positive probability, the
diameter of the random graph is, with high probability, bounded below by a constant times the
logarithm of the size of the graph. On the other hand, assuming that all degrees are at least 3,
we show that, for $\tau \in (2,3)$, the diameter of the graph is with high probability bounded
from above by a constant times the log log of the size of the graph.

1 Introduction 

Random graph models for complex networks have received a tremendous amount of attention in
the past decade. Measurements have shown that many real networks share two properties. The
first fundamental network property is the fact that typical distances between nodes are small. This
is called the 'small world' phenomenon (see [27]). For example, in the Internet, IP-packets cannot
use more than a threshold of physical links, and if the distances in terms of the physical links were
large, e-mail service would simply break down. Thus, the graph of the Internet has evolved in
such a way that typical distances are relatively small, even though the Internet is rather large. The
second and maybe more surprising property of many networks is that the number of nodes with
degree $k$ falls off as an inverse power of $k$. This is called a 'power law degree sequence', and resulting
graphs often go under the name 'scale-free graphs', which refers to the fact that the asymptotics
of the degree sequence is independent of the size of the graph (see [15]). We refer to [2, 21, 26]
and the references therein for an introduction to complex networks and many examples where the
above two properties hold.

The observation that many real networks have the above two properties has incited a burst of 
activity in network modeling using random graphs. These models can be divided into two distinct 
types: 'static' models, where we model a graph of a given size as a time snap of a real network, 
and 'dynamical' models, where we model the growth of the network. Static models aim to describe
real networks and their topology at a given time instant. Dynamical models aim to explain how 
the networks came to be as they are. Such explanations often focus on the growth of the network 
as a way to explain the power law degree sequences by means of 'preferential attachment' growth 
rules, where added nodes and edges are more likely to be attached to nodes that already have large 
degrees. See [5] for a popular account of preferential attachment. 

The random graph where the degrees are i.i.d. is sometimes called the configuration model (see 
[24]). In this paper, we study properties of the connected components in the random graph with
i.i.d. degrees, and prove results concerning the scaling of the largest and second largest connected 
components, as well as the diameter. 

The remainder of this introduction is organized as follows. In Section 1.1 we start by introducing
the configuration model, and in Section 1.2 we discuss the new results concerning component sizes
and diameter of this graph. We describe related work and open questions in Section 1.3. We
complete the introduction with the organization of the paper in Section 1.4.

1.1 The configuration model 

Fix an integer $N$. Consider an i.i.d. sequence of random variables $D_1, D_2, \ldots, D_N$. We will construct
an undirected graph with $N$ nodes where node $j$ has degree $D_j$. We will assume that $L_N = \sum_{j=1}^{N} D_j$
is even. If $L_N$ is odd, then we add a stub to the $N$th node, so that $D_N$ is increased by 1. This
single change will make hardly any difference in what follows, and we will ignore this effect. We
will later specify the distribution of $D_1$.

To construct the graph, we have $N$ separate nodes and incident to node $j$, we have $D_j$ stubs
or half-edges. All stubs need to be connected to build the graph. The stubs are numbered in a
given order from 1 to $L_N$. We start by connecting at random the first stub with one of the $L_N - 1$
remaining stubs. Once paired, two stubs form a single edge of the graph. Hence, a stub can be
seen as the left or the right half of an edge. We continue the procedure of randomly choosing and
pairing the stubs until all stubs are connected. Unfortunately, nodes having self-loops may occur.
However, self-loops are scarce when $N \to \infty$, as shown in [7].
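
The stub-pairing construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `configuration_model` and the use of a shuffle are ours, not the paper's. Shuffling the list of stubs and pairing consecutive entries produces a uniformly random perfect matching of the stubs, which has the same law as pairing the stubs one at a time.

```python
import random


def configuration_model(degrees, rng=None):
    """Pair stubs uniformly at random and return the list of edges (u, v).

    `degrees[j]` is the number of stubs attached to node j. Self-loops and
    multiple edges may occur, exactly as in the construction in the text.
    """
    rng = rng or random.Random()
    degrees = list(degrees)
    if sum(degrees) % 2 == 1:
        # As in the text: if the total number of stubs is odd,
        # add one stub to the last node.
        degrees[-1] += 1
    # One list entry per stub, labelled by the node it belongs to.
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    # Consecutive entries of the shuffled list form the edges.
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]
```

For example, `configuration_model([3, 2, 2, 1, 1])` first raises the last degree to 2 (the total 9 is odd) and then returns 5 edges whose endpoint counts reproduce the adjusted degrees.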

The above model is a variant of the configuration model, which, given a degree sequence, is 
the random graph with that given degree sequence. The degree sequence of a graph is the vector 
of which the $k$th coordinate equals the fraction of nodes with degree $k$. In our model, by the law
of large numbers, the degree sequence is close to the distribution of the nodal degree $D$ of which
$D_1, \ldots, D_N$ are i.i.d. copies.

The probability mass function and the distribution function of the nodal degree law are denoted
by

$$\mathbb{P}(D_1 = k) = f_k, \quad k = 1, 2, \ldots, \qquad \text{and} \qquad F(x) = \sum_{k=1}^{\lfloor x \rfloor} f_k, \tag{1.1}$$

where $\lfloor x \rfloor$ is the largest integer smaller than or equal to $x$. We pay special attention to distributions
of the form

$$1 - F(x) = x^{1-\tau} L(x), \tag{1.2}$$

where $\tau > 1$ and $L$ is slowly varying at infinity. This means that the random variables $D_j$ obey
a power law, and the factor $L$ is meant to generalize the model. For one of our main results
(Theorem 1.2 below) we assume the following more specific conditions, splitting between the cases
$\tau \in (1,2)$, $\tau \in (2,3)$ and $\tau > 3$:

Assumption 1.1 (i) For $\tau \in (1,2)$, we assume (1.2).

(ii) For $\tau \in (2,3)$, we assume that there exist $\gamma \in [0,1)$ and $C > 0$ such that

$$x^{1-\tau-C(\log x)^{\gamma-1}} \leq 1 - F(x) \leq x^{1-\tau+C(\log x)^{\gamma-1}}, \quad \text{for large } x. \tag{1.3}$$

(iii) For $\tau > 3$, we assume that there exists a constant $c > 0$ such that

$$1 - F(x) \leq c x^{1-\tau}, \quad \text{for all } x \geq 1, \tag{1.4}$$

and that $\nu > 1$, where $\nu$ is given by

$$\nu = \frac{\mathbb{E}[D_1(D_1-1)]}{\mathbb{E}[D_1]}. \tag{1.5}$$

Distributions satisfying (1.4) include distributions which have a lighter tail than a power law, and
(1.4) is only slightly stronger than assuming finite variance. The condition in (1.3) is slightly
stronger than (1.2).



1.2 Connected components and diameter of the random graph 

In this paper, we prove results concerning the sizes of the connected components in the random 
graph, and give bounds on the diameter. 

For these results, we need some additional notation. For $\tau > 2$, we introduce a delayed branching
process $\{Z_n\}_{n \geq 1}$, where in the first generation the offspring distribution is chosen according to (1.1)
and in the second and further generations the offspring is chosen in accordance to $g$ given by

$$g_k = \frac{(k+1) f_{k+1}}{\mu}, \quad k = 0, 1, \ldots, \qquad \text{where } \mu = \mathbb{E}[D_1]. \tag{1.6}$$

In the statements below, we write $G$ for the random graph with degree distribution given by
(1.1), and we denote for $\tau > 2$ the survival probability of the delayed branching process $\{Z_n\}$
described above by $q$. When $1 < \tau < 2$, for which $\mu = \mathbb{E}[D_1] = \infty$, we define $q = 1$. We define, for
$\delta > 0$,

$$\gamma_1^* = \frac{1+\delta}{\log \mu - \log 2} \quad (\tau > 2), \qquad \gamma_2^* = \frac{\tau-1}{2-\tau}(1+\delta) \quad (\tau \in (1,2)). \tag{1.7}$$

In the sequel we use the abbreviation whp to denote that a statement holds with probability
$1 - o(1)$ as $N \to \infty$.
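
As a hedged numerical illustration of the survival probability $q$ of the delayed branching process, the sketch below assumes a degree law with finite support (a simplification, since power laws have infinite support) and uses the standard fact that the extinction probability of a Galton-Watson process is the smallest fixed point of its offspring generating function. The function name `survival_probability` and its parameters are ours.

```python
def survival_probability(f, iterations=10_000, tol=1e-14):
    """Survival probability q of the delayed branching process.

    `f` maps each degree k >= 1 to f_k = P(D_1 = k); finite support is an
    assumption of this sketch. The first generation uses f, later
    generations use g_k = (k + 1) f_{k+1} / mu as in (1.6).
    """
    mu = sum(k * p for k, p in f.items())
    g = {k - 1: k * p / mu for k, p in f.items()}
    # Smallest fixed point of the generating function of g, found by
    # iterating eta -> G_g(eta) starting from eta = 0.
    eta = 0.0
    for _ in range(iterations):
        new_eta = sum(p * eta ** k for k, p in g.items())
        converged = abs(new_eta - eta) < tol
        eta = new_eta
        if converged:
            break
    # Extinction requires every first-generation subtree to die out.
    return 1.0 - sum(p * eta ** k for k, p in f.items())
```

With all degrees equal to 3 the process cannot die out and the function returns 1. With $f_1 = f_3 = 1/2$ one gets $\mu = 2$, the fixed point is $1/3$, and the survival probability is $22/27$.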

Theorem 1.2 (The giant component) Fix $\delta > 0$. When Assumption 1.1 holds and $q \in (0,1]$,
then, whp, the largest connected component in $G$ has $qN(1+o(1))$ nodes, and all other connected
components have at most $\gamma_2^*$ nodes when $\tau \in (1,2)$, and at most $\gamma_1^* \log N$ nodes when $\tau > 2$ and
$\mu > 2$.

Theorem 1.2 is similar in spirit to the main results in [22, 23], where the connected components
in the configuration model were studied for fixed degrees, rather than i.i.d. degrees. In [22, 23],
however, restrictions were posed on the maximal degree. Indeed, for the asymptotics of the largest
connected component, it was assumed that the maximal degree is bounded by $N^{1/4-\varepsilon}$, for some
$\varepsilon > 0$. Since, for i.i.d. degrees, the maximal degree is of the order $N^{1/(\tau-1)+o(1)}$, where $\tau > 1$ is the
degree exponent, this restricts to $\tau > 5$. Theorem 1.2 allows for any $\tau > 1$, at the expense of
the assumption that $\mu > 2$ and the fact that the degrees are i.i.d. The latter restriction comes
from the fact that, in the proofs, we make essential use of the results in [17, 19, 20]. However, a
close inspection of the proofs in [17, 19, 20] shows that independence is not exactly needed. This
is explained in more detail for the case that $\tau \in (2,3)$ in [20]. The restriction $\mu > 2$ is somewhat
unusual, and is not present in [22, 23]. However, for most real networks, this condition is satisfied
(see e.g., [HE1]). 

In [19], a result similar to Theorem 1.2 was proved for the case when $\tau > 3$, using the results of
[22, 23], without the assumption that $\mu > 2$. In this case, the main restriction is that $\nu > 1$. This
result is proved by suitably adapting the graph by erasing some edges from the nodes with degree
larger than $N^{1/4-\varepsilon}$. The result in [22, 23] can be restated as saying that the largest component is
$qN(1+o(1))$ when $\nu > 1$ and is $o(N)$ when $\nu < 1$. Our result applies in certain cases where the
results of [22, 23] do not apply (such as the cases when $\tau \in (2,3)$ and $\tau \in (1,2)$), and our proof is
relatively simple and yields rather explicit bounds.

The proof of Theorem 1.2 is organized as follows. In [19] and [20], respectively, it was shown
that for $\tau > 2$ the probability that two nodes are connected is asymptotically equal to $q^2$, where
$q$ arises as the survival probability of the branching process approximation of the shortest-path
graph from a given node. For $\tau \in (1,2)$ this branching process is not defined and we use the
convention $q = 1$, because for $\tau \in (1,2)$ it was shown in [17] that the probability that two arbitrary
nodes are connected equals 1 with high probability. These results suggest that there exists a largest
connected component of size roughly equal to $qN$. The proofs in [19, 20] rely on branching process
comparisons of the number of nodes that can be reached within $k$ steps. The proof in [17] relies
mainly on extreme value theory. The main ingredient in the proof of Theorem 1.2 is that we show
that, when $\mu > 2$, any connected component is either very large, or bounded above by $\gamma_2^*$ when
$\tau \in (1,2)$, and by $\gamma_1^* \log N$ when $\tau > 2$ (see Proposition 3.3 below). The proof of these facts again
relies on branching process comparisons, using the detailed estimates obtained in [19, 20]. Since
any two nodes are connected to each other with positive probability, there must be at least one
such large connected component. The proof is completed by showing that this largest connected
component of size proportional to the size of the graph is unique, and that its size is close to $qN$.

While Theorem 1.2 provides good upper bounds on the second largest component and detailed
asymptotics on the largest component, it leaves a number of questions open. For example, how large
is the second largest component, and, when $q = 1$, is the graph connected? We next investigate
these questions.

The following theorem says that $\gamma_1^*$ and $\gamma_2^*$ defined in (1.7) provide quite sharp estimates for the
size of the connected components that are not the largest in the random graph. Define, for $f_1 > 0$,

$$\gamma_1^{**} = \frac{1-\delta}{\log \mu - \log f_1} \quad (\tau > 2), \qquad \gamma_2^{**} = \frac{\tau-1}{2-\tau}(1-\delta) \quad (\tau \in (1,2)). \tag{1.8}$$



Theorem 1.3 (Sizes of non-giant components) (i) Let $\tau \in (1,2)$ and $f_1 > 0$. Then, for
any $\delta > 0$ and $k \leq \gamma_2^{**}$, and such that $f_k > 0$, whp the random graph contains a connected
component with $k + 1$ nodes.

(ii) Let $\tau > 2$ and $\mu > f_1 > 0$, and assume that

$$f_k = L_f(k) k^{-\tau}, \quad k \to \infty, \tag{1.9}$$

where $L_f(\cdot)$ is a slowly varying function. Then, for any $\delta > 0$, and $k = k_N \leq \gamma_1^{**} \log N$ and
such that $f_k > 0$, whp the random graph contains a connected component with $k + 1$ nodes.

We present some further results in the more special case when $q = 1$. In this case, either $\mu = \infty$
or $D \geq 2$ a.s. Then, from Theorem 1.2 we have that there exists a unique connected component of
size $N - o(N)$ and all other connected components are much smaller. For this case we investigate
when the random graph is whp connected. Let $C_N$ denote the number of nodes in the complement
of the largest connected component of the random graph.
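
To make the quantity $C_N$ concrete, here is a small sketch that computes the component sizes of a multigraph given as an edge list, using a union-find structure, and then returns the number of nodes outside the largest component. The helper names `component_sizes` and `complement_size` are ours, and the edge-list input format is an assumption of this illustration.

```python
from collections import Counter


def component_sizes(n_nodes, edges):
    """Sizes of the connected components, largest first."""
    parent = list(range(n_nodes))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    counts = Counter(find(x) for x in range(n_nodes))
    return sorted(counts.values(), reverse=True)


def complement_size(n_nodes, edges):
    """C_N: number of nodes outside the largest connected component."""
    return n_nodes - component_sizes(n_nodes, edges)[0]
```

Self-loops and repeated edges are harmless here, since they never merge two distinct components.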

Theorem 1.4 (Size of complement of giant component) (i) Let $\mathbb{P}(D \geq 2) = 1$ and
$2 < \mu < \infty$. Then, there exist $a < 1$ and $b > 0$ such that for each $1 \leq k \leq N$, as $N \to \infty$,

$$\mathbb{P}(C_N \geq k) \leq b a^k. \tag{1.10}$$

(ii) If $\mathbb{P}(D \geq 3) = 1$, then

$$\lim_{N \to \infty} \mathbb{P}(C_N = 0) = 1. \tag{1.11}$$

Consequently, in the latter case, the random graph is connected whp.

(iii) The conclusion (1.11) also holds when $\mu = \infty$ and $\mathbb{P}(D \geq 2) = 1$ instead of $\mathbb{P}(D \geq 3) = 1$.

(iv) The conclusion (1.11) also holds when $L_N/N^2 \to \infty$ in probability, without any further
restrictions on the degree distribution.

Clearly, the restriction $\mathbb{P}(D \geq 2) = 1$ is necessary to obtain a connected graph, but it is not
sufficient, as we will see below. Equation (1.10) establishes that the size of the complement of the
largest connected component has exponential tails. When $f_2 = \mathbb{P}(D = 2) > 0$, then it is not hard to see that the
expected number of pairs of nodes with degree equal to two that are connected to each other, for
$\mu < \infty$, is asymptotically equal to

$$\frac{(N f_2)^2}{2} \cdot \frac{2}{L_N^2} \approx \frac{f_2^2}{\mu^2}. \tag{1.12}$$

Indeed, $(N f_2)^2/2$ is roughly equal to the number of pairs of nodes with degree 2, and $2(L_N)^{-2}$ is
roughly equal to the probability that the two stubs between these two nodes are connected to each
other. The above mean is strictly positive, which suggests that the number of pairs of nodes with
degree equal to two that are connected to each other is with strictly positive probability positive.
We believe that the proof in [6, Section 2.4] can be followed to show that this number is close
to a Poisson distribution with parameter $(f_2/\mu)^2$. Similar computations can be performed for the
number of cycles of length 3 or larger consisting of nodes with degree precisely equal to 2. Thus, for
$f_2 > 0$, (1.10) seems the best possible result. We show in Theorem 1.4(ii) that, when $f_1 + f_2 = 0$,
the graph is connected whp. The same result holds (see Theorem 1.4(iii) and (iv)) when $f_1 = 0$
and $\mu = \infty$, or when $L_N/N^2 \to \infty$ in probability.

Finally, we give, in Theorems 1.5 and 1.6 below, bounds on the diameter of the graph, which we
define as the largest distance between any two nodes that are connected:

Theorem 1.5 (Lower bound on diameter) For $\tau > 2$, assuming that $f_1 + f_2 > 0$ and $f_1 < 1$,
there exists a positive constant $\alpha$ such that whp the diameter of $G$ is bounded below by $\alpha \log N$, as
$N \to \infty$.

The result in Theorem 1.5 is most interesting in the case when $\tau \in (2,3)$. Indeed, by [20,
Theorem 1.2], the typical distance for $\tau \in (2,3)$ is proportional to $\log \log N$, whereas we show here
that the diameter is bounded below by a constant times $\log N$ when $f_1 + f_2 > 0$ and $f_1 < 1$.
Therefore, we see that the average distance and the diameter are of a different order of magnitude,
which is rather interesting. The pairs of nodes where the distance is of the order $\log N$ are thus
scarce. The proof of Theorem 1.5 reveals that these pairs are along long lines of vertices with degree
2 that are connected to each other.

We end with a theorem stating that when $\tau \in (2,3)$, the above assumption that $f_1 + f_2 > 0$
is necessary and sufficient for $\log N$ lower bounds on the diameter. We assume that there exists a
$\tau \in (2,3)$ such that, for some $c > 0$ and all $x \geq 1$,

$$1 - F(x) \geq c x^{1-\tau}. \tag{1.13}$$

Observe that (1.13) is strictly weaker than (1.3). Then the main result is as follows:

Theorem 1.6 (Upper bound on diameter) Assume that $f_1 + f_2 = 0$ and that (1.13) holds.
Then, there exists a positive constant $C_F$ such that whp the diameter of $G$ is bounded above by
$C_F \log \log N$, as $N \to \infty$.

In the course of the proof of Theorem 1.6 we will establish an explicit expression for $C_F$ in
terms of $F$.

We remark that Theorems 1.3-1.6 do not rely on Assumption 1.1, while Theorem 1.2 does. The
reason for this is that the proof of Theorem 1.2 relies on the results proved in [17, 19, 20], while
the proofs of Theorems 1.3-1.6 are completely self-contained.

1.3 Related results and open problems in static models

As mentioned in the introduction, the results in this paper are partly based on a coupling between
the configuration model and branching processes, presented in two previous publications [19] and
[20]. For later reference, we will summarize the graph distance results obtained in these papers and
in [17], for the case $\tau \in (1,2)$.

The graph distance $H_N$ between the nodes 1 and 2 is defined as the minimum number of edges
that form a path from 1 to 2. By convention, the distance equals $\infty$ if 1 and 2 are not connected.
Observe that the distance between two randomly chosen nodes is equal in distribution to $H_N$,
because the nodes are exchangeable. The main result in [17] is that for $\tau \in (1,2)$ and in the limit
for $N$ tending to infinity, the distribution of the graph distance is concentrated on the points 2 and
3, i.e., when Assumption 1.1 holds, then

$$\lim_{N \to \infty} \mathbb{P}(H_N = 2) = 1 - \lim_{N \to \infty} \mathbb{P}(H_N = 3) = p, \tag{1.14}$$

where $p = p_F \in (0,1)$. For $\tau \in (2,3)$ we showed in [20] that, when Assumption 1.1 holds, the
fluctuations of $H_N$ around

$$2\,\frac{\log \log N}{|\log(\tau-2)|} \tag{1.15}$$

are $O_p(1)$ as $N \to \infty$; and finally, we showed in [19] that the same result holds for $\tau > 3$, with the
centering in (1.15) replaced by

$$\log_\nu N. \tag{1.16}$$
The model studied in this paper with $\tau \in (2,3)$ is also studied in [25], where it is proved that
whp the graph distance $H_N$ is less than $2\,\frac{\log \log N}{|\log(\tau-2)|} + 2k(N)$, where



2 



3 - r 



£(N) 



k(N) = exp £(N) with lim - — - — ^ = oo. (1.17) 



N— >oo log log log log N 



At approximately the same moment the $\log \log N$-scaling result appeared in the physics literature
[11], where it was derived in a non-rigorous way. The distance results in [19] were generalized to a
much larger class of random graphs in [16].

There is substantial work on random graphs that are, although different from ours, still similar in
spirit. In [1, 12, 13, 22, 23], random graphs were considered with a degree sequence that is precisely
equal to a power law, meaning that the number of nodes with degree $k$ is precisely proportional to
$k^{-\tau}$. A second related model can be found in [HE], where edges between nodes $i$ and $j$ are present
with probability equal to $w_i w_j / \sum_l w_l$ for some 'expected degree vector' $w = (w_1, \ldots, w_N)$. In [10],
these authors study a so-called hybrid model.

Arratia and Liggett [3] study whether simple graphs exist with an i.i.d. degree distribution,
i.e., graphs without self-loops and multiple edges. It is not hard to see that when $\tau < 2$ this
happens with probability 0 (since the largest degree is larger than $N$). When $\tau > 2$, however, this
probability is asymptotic to the probability that the sum of $N$ i.i.d. random variables is even, which
is close to 1/2. When $\tau = 2$, the probability can converge to any element of $[0, \frac{1}{2}]$, depending on
the slowly varying function in (1.2). A similar problem is addressed in [7], where various ways in which
self-loops and multiple edges can be avoided are discussed. Among others, in [7], it is proved that
when the degrees are i.i.d. and all self-loops and multiple edges are removed, then the power law
degree sequence remains valid.

There are many open questions remaining in the configuration model. For instance, in [19], we
have shown that for $\tau > 3$, the largest connected component has size $qN$, where $q$ is the survival
probability of the delayed branching process. All other connected components have size at most
$\gamma \log N$, for some $\gamma > 0$. For $\tau \in (2,3)$, such a result is given in this paper under the extra
assumption that $\mu > 2$. It would be of interest to investigate whether the same result holds for
$\tau < 3$ and general $\mu$ when $q > 0$.
A second quantity of interest is the diameter of the graph, which is important in many applications.
For instance, in the Internet, a message is killed when the number of hops exceeds a finite
threshold. Thus, it would be interesting to investigate how the diameter grows with the size of the
graph. The result in Theorem 1.5 is a lower bound in the case when $f_2 > 0$, whereas Theorem
1.6 gives an upper bound for $\tau \in (2,3)$, when $f_1 + f_2 = 0$; however, a better understanding of the
diameter is necessary.

An important property of the topology of a graph is its clustering, which basically describes 
how likely two nodes that have an edge to a common node are to be connected by an edge. In 
general, in random graphs, this clustering is much smaller than the clustering in real networks. 
It would be of interest to investigate graphs with a higher clustering in more detail. The hybrid 
graphs in [10] are an important step in that direction. 



1.4 Organization of the paper 

The paper is organized as follows. In Section 2 we prove Theorem 1.4 and in Section 3 we prove
Theorem 1.2. Finally, in Section 4 we prove the lower bounds on the second largest connected
component in Theorem 1.3 and the bounds on the diameter in Theorems 1.5 and 1.6.



2 Connectivity properties 

In this section we prove connectivity properties of the random graph defined in Section 1.1. In
particular we will prove Theorem 1.4, which states among other things that $\mathbb{P}(C_N \geq k)$, where $C_N$
denotes the number of nodes in the complement of the largest connected component of the random
graph, is exponentially bounded as $N \to \infty$, when $\mathbb{P}(D \geq 2) = 1$ and $2 < \mu < \infty$. Throughout the
paper, we write $I[E]$ for the indicator of the event $E$.

We start by stating a proposition which bounds the conditional probability $\mathbb{P}_N(C_N \geq s)$, where $\mathbb{P}_N$
denotes the probability given the degrees $D_1, \ldots, D_N$.

Proposition 2.1 Let $r \in \{1,2\}$, and assume that $\mathbb{P}(D_1 \geq r) = 1$. Then, for any $1 \leq s \leq N/3$,

$$\mathbb{P}_N(C_N \geq s) \leq 2 \sum_{j=s}^{N-s} \left( \frac{2N^{2/r}}{L_N} \right)^{\lceil jr/2 \rceil}, \quad \text{a.s.} \tag{2.1}$$

We first show that Theorem 1.4(i), (iii) and (iv) are an immediate consequence of Proposition 2.1.
Theorem 1.4(ii) is proved in Section 4.3 below.
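
To get a feeling for the size of the bound in Proposition 2.1, the following sketch evaluates it in the case $r = 2$, where the summand reads $(2N/L_N)^j$ (this is the form used again in (2.12)). The function name `bound_r2` is ours, and the numerical values in the usage note are hypothetical and not taken from the paper.

```python
def bound_r2(N, L_N, s):
    """Right-hand side of the Proposition 2.1 bound for r = 2.

    Evaluates 2 * sum_{j=s}^{N-s} (2N / L_N)^j. The bound is only
    informative when L_N > 2N, i.e. when the average degree exceeds 2.
    """
    ratio = 2.0 * N / L_N
    return 2.0 * sum(ratio ** j for j in range(s, N - s + 1))
```

For instance, with $N = 30$ nodes and $L_N = 120$ stubs the ratio is $1/2$, and the bound decays geometrically in $s$, in line with the exponential tail in (1.10).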

Proof of Theorem 1.4(i), (iii) and (iv). We start with case (i), where $2 < \mu < \infty$ and
$\mathbb{P}(D \geq 2) = 1$. We denote $\mu_N = L_N/N$. Taking expectations on both sides of (2.1) yields, for
$r = 2$ and with $1 \leq s \leq N/3$,

$$\mathbb{P}(C_N \geq s) \leq \mathbb{E}\Bigl[ 2 \sum_{j=s}^{N-s} \Bigl( \frac{2}{\mu_N} \Bigr)^{j} I[\mu_N \geq 1 + \mu/2] \Bigr] + \mathbb{P}(\mu_N < 1 + \mu/2)$$
$$\leq 2\, \frac{\bigl( \frac{2}{1+\mu/2} \bigr)^{s}}{1 - \frac{2}{1+\mu/2}} + \mathbb{P}(\mu_N < 1 + \mu/2) \leq \frac{2(2+\mu)}{\mu-2} \Bigl( \frac{4}{2+\mu} \Bigr)^{s} + e^{-IN}, \tag{2.2}$$

where $I$ is the exponential rate of the event $\mu_N = L_N/N < 1 + \mu/2$, which is strictly positive
since $1 + \mu/2 < \mu$. Indeed, the final inequality of (2.2) holds for all $N \geq 1$, because of Chernoff's
bound and the fact that $\mu > 2$, and that for $t \geq 0$ the Laplace transform $\mathbb{E}[\exp\{-tD_1\}]$ exists. For
$2 < \mu < \infty$ and $\mathbb{P}(D \geq 2) = 1$, this shows that (1.10) holds for all $N \geq 1$ and $1 \leq s \leq N/3$, by
taking $a = \max\{e^{-3I}, \frac{4}{2+\mu}\}$ and $b = 1 + \frac{2(2+\mu)}{\mu-2}$. The statement for $N/3 \leq s \leq N$ follows from:

$$\mathbb{P}(C_N \geq s) \leq \mathbb{P}(C_N \geq N/3) \leq b a^{N/3} = b (a^{1/3})^{N} \leq b (a^{1/3})^{s}. \tag{2.3}$$

This proves (i).

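Consider next the case (iii), where $\mu = \infty$ and $\mathbb{P}(D \geq 2) = 1$. Then, for any $\varepsilon > 0$ and large
enough $N$,

$$\mathbb{P}\Bigl( \frac{2N}{L_N} > \varepsilon \Bigr) \leq \varepsilon.$$

Hence, the probability that the random graph is disconnected, or $\mathbb{P}(C_N \geq 1)$, is due to (2.1) at
most

$$\mathbb{E}\Bigl[ 2 \sum_{j=1}^{N-1} \Bigl( \frac{2N}{L_N} \Bigr)^{j} I[2N \leq \varepsilon L_N] \Bigr] + \mathbb{P}\Bigl( \frac{2N}{L_N} > \varepsilon \Bigr) \leq 2 \sum_{j=1}^{\infty} \varepsilon^{j} + \varepsilon \leq \frac{2\varepsilon}{1-\varepsilon} + \varepsilon. \tag{2.4}$$

Since $\varepsilon > 0$ can be chosen arbitrarily small, the probability that the graph is disconnected tends to
zero, as $N$ tends to infinity.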

We complete the proof with the case (iv), where $L_N/N^2 \to \infty$ in probability, which for example
is the case when $\tau \in (1, \frac{3}{2})$. In this case, we take $r = s = 1$, and we use the assumption that
$N^2/L_N \to 0$ in probability as $N \to \infty$, to see that

$$\mathbb{P}(C_N \geq 1) \leq 2 \sum_{j=1}^{N-1} (2\varepsilon)^{\lceil j/2 \rceil} + \mathbb{P}(N^2/L_N > \varepsilon) \leq 9\varepsilon, \tag{2.5}$$

for each $\varepsilon > 0$. □

Proof of Proposition 2.1. For any $s \leq N/3$, we estimate the probability that $C_N \geq s$. If
$C_N \geq s$, then there exist two disjoint sets of nodes, one with $s \leq j \leq N - s$ nodes and another
with $N - j$ nodes, such that all stubs of the first set pair within the first set and all stubs of the second
set pair within the second. To see this, we note that when the largest component has size at least
$s$, then the statement is correct, and the two disjoint sets are the nodes in the largest component
and the ones outside of the largest component. Thus we are left to prove the statement when the
largest component has size at most $s - 1$. In this case we order the connected components by size
(and when there are multiple components of the same size, we do so in an arbitrary way). Then
we start with the largest component and we successively add to it the largest component that is
still available, until the total size of these connected components is at least $s$. Since the largest
component has size at most $s - 1$, the total size we end up with is in between $s$ and $2s$. Since
$s \leq N/3$ we obtain $2s \leq N - s$, and we arrive at the claim that the set consists of a number of
nodes which is in between $s$ and $N - s$. Put the chosen connected components into the first set
and all remaining nodes in the second. By construction there are no edges between these two sets
of nodes.

Our plan is to show that the probability that the first set of size $j$, where $s \leq j \leq N - s$, pairs
within its own group and the second set of size $N - j$ also pairs within its own group is bounded
by the right side of (2.1). We use Boole's inequality to bound the probability of the union of all
possible choices for the first set.

To this end, let $i_1, \ldots, i_j$ be the nodes and $A_j = \sum_{l=1}^{j} D_{i_l}$ be the total number of stubs in the
first group, and $k_1, \ldots, k_{N-j}$ and $B_{N-j}$ the nodes and number of stubs in the second group. We
remark that $A_j + B_{N-j} = L_N$, and since the groups are not connected, both $A_j$ and $B_{N-j}$ are even.
The $\mathbb{P}_N$-probability that the groups are not connected is then, for each fixed choice $i_1, \ldots, i_j$, equal
to

$$\prod_{n=0}^{A_j/2-1} \frac{A_j - 2n - 1}{L_N - 2n - 1} = \frac{\prod_{n=0}^{A_j/2-1} (A_j - 2n - 1)}{\prod_{n=0}^{A_j/2-1} (L_N - 2n - 1)} = \prod_{m=0}^{A_j/2-1} \frac{2m+1}{L_N - 2m - 1}. \tag{2.6}$$
By symmetry between $A_j$ and $B_{N-j}$, we also obtain that this probability is equal to

$$\prod_{m=0}^{B_{N-j}/2-1} \frac{2m+1}{L_N - 2m - 1}. \tag{2.7}$$

Observe that for integers $j \geq 0$, the map

$$j \mapsto \prod_{m=0}^{j} \frac{2m+1}{L_N - 2m - 1} \quad \text{is decreasing for } j \leq \frac{L_N}{4} - \frac{1}{2}. \tag{2.8}$$

Suppose that $A_j \leq L_N/2 - 1$. Then we use (2.6). Due to $\mathbb{P}(D_1 \geq r) = 1$, and since $A_j$ is even, we
have $\lceil jr/2 \rceil \leq A_j/2 \leq L_N/4 - 1/2$ a.s., and hence, by (2.8), the final expression in (2.6) is at most

$$\prod_{m=0}^{\lceil jr/2 \rceil - 1} \frac{2m+1}{L_N - 2m - 1}.$$

Suppose that $A_j \geq L_N/2$. Since $A_j + B_{N-j} = L_N$ and $\mathbb{P}(D_1 \geq r) = 1$, we have then that
$\lceil (N-j)r/2 \rceil - 1 \leq B_{N-j}/2 - 1 \leq L_N/4 - 1/2$ a.s., and we estimate (2.7), by (2.8), by

$$\prod_{m=0}^{\lceil (N-j)r/2 \rceil - 1} \frac{2m+1}{L_N - 2m - 1}.$$

Hence, the $\mathbb{P}_N$-probability that the two groups of nodes are not connected is at most

$$\prod_{m=0}^{\lceil jr/2 \rceil - 1} \frac{2m+1}{L_N - 2m - 1}\, I[A_j < L_N/2] + \prod_{m=0}^{\lceil (N-j)r/2 \rceil - 1} \frac{2m+1}{L_N - 2m - 1}\, I[A_j \geq L_N/2]$$
$$\leq \prod_{m=0}^{\lceil jr/2 \rceil - 1} \frac{2m+1}{L_N - 2m - 1} + \prod_{m=0}^{\lceil (N-j)r/2 \rceil - 1} \frac{2m+1}{L_N - 2m - 1}. \tag{2.9}$$




For $1 \leq j \leq N - 1$, we have at most $\binom{N}{j}$ ways to choose $j$ nodes $i_1, \ldots, i_j$. Hence, by Boole's
inequality,

$$\mathbb{P}_N(C_N \geq s) \leq \sum_{j=s}^{N-s} \frac{N!}{j!(N-j)!} \Bigl( \prod_{m=0}^{\lceil jr/2 \rceil - 1} \frac{2m+1}{L_N - 2m - 1} + \prod_{m=0}^{\lceil (N-j)r/2 \rceil - 1} \frac{2m+1}{L_N - 2m - 1} \Bigr)$$
$$= 2 \sum_{j=s}^{N-s} \Bigl( \prod_{m=0}^{j-1} \frac{N-m}{m+1} \Bigr) \prod_{m=0}^{\lceil jr/2 \rceil - 1} \frac{2m+1}{L_N - 2m - 1}, \tag{2.10}$$

where by convention the product of the empty set equals 1, and where we used symmetry (between
$s$ and $N - s$) together with the identity

$$\frac{N!}{j!(N-j)!} = \prod_{m=0}^{j-1} \frac{N-m}{m+1}.$$


For the remaining part of the proof we will make use of the following lemma:

Lemma 2.2 For any $1 \leq k \leq N - 1$,

$$\prod_{m=0}^{k-1} \frac{(N-m)(2m+1)}{(m+1)(2N-2m-1)} \leq 1. \tag{2.11}$$

Proof. Define, for $0 \leq m \leq N - 1$,

$$h(m) = \frac{(N-m)(2m+1)}{(m+1)(2N-2m-1)}.$$

Then

$$h(m) \leq 1, \quad \text{if } m \leq (N-1)/2,$$
$$h(m)h(N-m-1) = 1, \quad \text{for all } 0 \leq m \leq N-1,$$
$$h((N-1)/2) = 1, \quad \text{if } N \text{ is odd}.$$

Hence, (2.11) is trivial for $k \leq (N-1)/2 + 1$. If $k > (N-1)/2 + 1$ then $N - k < (N-1)/2$, and

$$\Bigl( \prod_{m=0}^{k-1} h(m) \Bigr)^{2} \leq \Bigl( \prod_{m=N-k}^{k-1} h(m) \Bigr)^{2} = \prod_{m=N-k}^{k-1} h(m)h(N-m-1) = 1.$$

Thus we have (2.11) for all $1 \leq k \leq N - 1$. □
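
Lemma 2.2 is easy to check numerically for small $N$. The sketch below uses exact rational arithmetic, so no rounding can blur the comparison with 1. The function names `h` and `lemma_holds` are ours, and $h$ is taken to be the ratio $(N-m)(2m+1)/((m+1)(2N-2m-1))$ from the proof.

```python
from fractions import Fraction


def h(m, N):
    """The ratio h(m) from the proof of Lemma 2.2, as an exact fraction."""
    return Fraction((N - m) * (2 * m + 1), (m + 1) * (2 * N - 2 * m - 1))


def lemma_holds(N):
    """Check that prod_{m=0}^{k-1} h(m) <= 1 for every 1 <= k <= N - 1."""
    product = Fraction(1)
    for k in range(1, N):
        product *= h(k - 1, N)
        if product > 1:
            return False
    return True
```

A call such as `all(lemma_holds(N) for N in range(2, 60))` exercises the inequality for all admissible $k$ at once, and `h(m, N) * h(N - m - 1, N) == 1` confirms the pairing identity used in the proof.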

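We now finish the proof of Proposition 2.1 when $r = 2$. In this case, due to (2.11), the right side
of (2.10), with $r = 2$, is at most

$$2 \sum_{j=s}^{N-s} \prod_{m=0}^{j-1} \frac{(N-m)(2m+1)}{(m+1)(L_N - 2m - 1)} \leq 2 \sum_{j=s}^{N-s} \prod_{m=0}^{j-1} \frac{2N - 2m - 1}{L_N - 2m - 1} \leq 2 \sum_{j=s}^{N-s} \Bigl( \frac{2N}{L_N} \Bigr)^{j}, \tag{2.12}$$

since $L_N \geq 2N$, a.s. This completes the proof when $r = 2$.
When $r = 1$, the right side of (2.10) equals

$$2 \sum_{j=s}^{N-s} \Bigl( \prod_{m=\lceil j/2 \rceil}^{j-1} \frac{N-m}{m+1} \Bigr) \Bigl( \prod_{m=0}^{\lceil j/2 \rceil - 1} \frac{(N-m)(2m+1)}{(m+1)(L_N - 2m - 1)} \Bigr), \tag{2.13}$$

which, due to (2.11), is at most

$$2 \sum_{j=s}^{N-s} \Bigl( \prod_{m=\lceil j/2 \rceil}^{j-1} \frac{N-m}{m+1} \Bigr) \Bigl( \prod_{m=0}^{\lceil j/2 \rceil - 1} \frac{2N - 2m - 1}{L_N - 2m - 1} \Bigr) \leq 2 \sum_{j=s}^{N-s} N^{\lfloor j/2 \rfloor} \Bigl( \frac{2N}{L_N} \Bigr)^{\lceil j/2 \rceil} \leq 2 \sum_{j=s}^{N-s} \Bigl( \frac{2N^2}{L_N} \Bigr)^{\lceil j/2 \rceil}.$$

This completes the proof of Proposition 2.1. □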



3 On the connected component sizes 

In this section, we investigate the largest connected component in more detail and prove Theorem
1.2. We start with some definitions. For $\delta, \varepsilon > 0$, we define $\gamma_N = \gamma_N(\delta, \varepsilon)$ by

In = 1 ~7^ — log N, (3.1) 

\ogfi N - log2 - 

where as before $\mu_N = L_N/N$. We also define a deterministic version of $\gamma_N$ in the following way:

In = ; ^ — N, (3.2) 

log ^-^2-^ 

where fi is a deterministic sequence for which 

In >/0 = l-o(l), iV^oo. (3.3) 
We start by formulating a version of Theorem 1.2 that is valid under the $\mathbb{P}_N$-probability. Before
stating this theorem, we need a number of assumptions. Define

$$q_N = \frac{1}{N} \sum_{i=1}^{N} \mathbb{P}_N(|C_i| > \gamma_N), \tag{3.4}$$

where $C_i$ is the connected component that contains $i$ and $|C|$ denotes the number of nodes in
$C \subseteq \{1, \ldots, N\}$. We assume that

$$\frac{1}{N^2} \sum_{i,j=1}^{N} \mathbb{P}_N(i, j \text{ connected}) = q_N^2 (1 + o(1)). \tag{3.5}$$

Note that (3.5) is an assumption involving the $\mathbb{P}_N$-probability. We can interpret (3.5) as saying
that, under $\mathbb{P}_N$, a large proportion of nodes $i, j$ for which the connected component consists of more
than $\gamma_N$ nodes, are connected.

Before we proceed with the preliminaries of the proof of Theorem 11.21 we give an outline of this 
proof. Denote for r > 2 by q the survival probability of the branching process {Z{\ and set q = 1 
for r € (1,2). We will show, using the coupling in [TOJ that 

(i) q N — > q, see Lemma 13.71 below, 

(ii) Var N (X N ) = o(N 2 ), where X N = ^i=i > 7jv] i see Lemma 13.81 below. 

Having verified these two items, we can apply Proposition 3.6 below, because $L_N \ge 2N$ follows
from $\mu > 2$ when $\tau > 2$ holds, and is immediate for $\tau \in (1,2)$. In the proof of Proposition 3.6 we
will order the connected components according to their size: $|C_{(1)}| \ge |C_{(2)}| \ge \cdots$ and will prove
that

$$\mathbb{P}(|C_{(2)}| > \gamma_N) = o(1),$$

in two steps. In the proof of these two steps, the statement of Lemma 3.4 below plays a prominent
role: the probability that there exists a connected component with at most $\eta N$ nodes, and in
between $\varepsilon L_N$ and $(1 - \varepsilon) L_N$ stubs, is exponentially small in $N$. We then finish with a proof that, whp,

$$X_N = \sum_{i=1}^{N} I[|C_i| > \gamma_N] = \sum_{l} |C_{(l)}|\, I[|C_{(l)}| > \gamma_N] = |C_{(1)}|\, I[|C_{(1)}| > \gamma_N] = |C_{(1)}|,$$

and reach the conclusion that $X_N = |C_{(1)}|$ is of order $N q_N$ by showing that
$\mathbb{P}(|X_N - N q_N| > \omega_N \sqrt{\mathrm{Var}(X_N)}) = o(1)$, for each sequence $\omega_N \to \infty$.

We now turn to the preliminaries of the proof of Theorem 1.2.

Theorem 3.1 Assume that: (i) $L_N \ge 2N$, (ii) Relation (3.5) holds, and (iii) $q_N \ge \varepsilon$ as $N \to \infty$,
for some $\varepsilon > 0$. Then, whp, under $\mathbb{P}_N$, the largest connected component in $G$ has $q_N N(1 + o(1))$
nodes, and all other connected components have at most $\gamma_N$ nodes. Moreover, whp, under $\mathbb{P}_N$, the
largest connected component has in between q N N(l ±u N ^J~^) nodes for any m N 



oo. 



Remark 3.2 Observe that besides the results on the sizes of the components, Theorem 3.1 also
includes a bound on the fluctuation of the size of the largest component.

The remainder of this section is organized as follows. In Section 3.1 we prove Theorem 3.1. In
Section 3.2 we use a modification of the proof of Theorem 3.1 to prove Theorem 1.2.

3.1 Connected components under $\mathbb{P}_N$

We start with a proposition that shows that the connected components, measured in terms of their
number of edges, are either quite small, i.e., less than $\gamma_N$, or very large, i.e., a positive fraction of
the total number of edges.

Proposition 3.3 Fix S > 0, assume that fi N > 2, and let < e < be such that log fi N — 
log 2 — j^2e ^ 0- Then, the ¥ N -probability that there exists a connected component with in between 
In = 7jy(£ 5 ^) an d £L N edges is bounded by 0(N~ S ). 

Consequently, when [i > 2 or [i = 00, the ¥ -probability that there exists a connected component with 
in between 7^ and eL N edges converges to 0, as N — > 00. 

Of course, we expect that there is a unique such large connected component, and this is what
we will prove later on. Note that when $\tau \in (1, 2)$, then, with large probability, $\mu_N = L_N/N > N^{\eta}$,
for some $\eta > 0$. In this case, we even have that $\gamma_N$ is uniformly bounded in $N$, and thus, the
connected components that do not contain a positive fraction of the edges are uniformly bounded
in their number of edges.
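
The dichotomy between small and very large components can be explored numerically. The following is a minimal Python sketch, assuming the random graph is realized by shuffling the list of all stubs and pairing consecutive entries. The function name `configuration_model_components`, the union-find bookkeeping, and the convention of dropping one stub when the total degree is odd are illustrative assumptions, not part of the paper. Note that the sketch measures components in nodes, whereas the proposition above counts edges.

```python
import random
from collections import Counter


def configuration_model_components(degrees, seed=0):
    """Pair stubs uniformly at random; return component sizes in nodes, largest first."""
    rng = random.Random(seed)
    # One list entry per stub, labelled by the node the stub is attached to.
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    if len(stubs) % 2 == 1:
        stubs.pop()  # illustrative convention: drop one stub so that all stubs can be paired
    rng.shuffle(stubs)  # a uniform shuffle followed by consecutive pairing is a uniform matching

    parent = list(range(len(degrees)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in zip(stubs[0::2], stubs[1::2]):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    sizes = Counter(find(v) for v in range(len(degrees)))
    return sorted(sizes.values(), reverse=True)
```

For instance, calling `configuration_model_components([3] * 2000)` and inspecting the first few entries of the result gives a quick impression of how the largest component compares with the others.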

Proof of Proposition 3.3. We adapt the proof of Proposition 2.1. Denote by $k$ the number of
edges in the connected component with in between $\gamma_N$ and $\varepsilon L_N$ edges. Then, we must have that all
the $2k$ stubs are connected to each other, i.e., they are not connected to stubs not in the $k$ edges.
This probability is bounded by

$$\prod_{n=0}^{k-1} \frac{2k - 2n - 1}{L_N - 2n - 1} = \prod_{m=0}^{k-1} \frac{2m + 1}{L_N - 2m - 1},$$



ignoring the fact that the component needs to be connected. We first prove the statement for
$k \le (\frac{N}{2} - 1) \wedge \varepsilon L_N$, and in a second step prove the statement for $(\frac{N}{2} - 1) \wedge \varepsilon L_N < k \le \varepsilon L_N$. We
abbreviate $R_N = (\frac{N}{2} - 1) \wedge \varepsilon L_N$.
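
The two forms of the pairing bound above agree because the numerators $2k - 2n - 1$, $n = 0, \ldots, k-1$, are the odd numbers $1, 3, \ldots, 2k-1$ in reverse order, while the denominators are the same in both products. A small exact check in Python, as a sketch (the function names are hypothetical and only serve this illustration):

```python
from fractions import Fraction


def pairing_product_n(k, L):
    """Product over n = 0..k-1 of (2k - 2n - 1) / (L - 2n - 1), in exact arithmetic."""
    out = Fraction(1)
    for n in range(k):
        out *= Fraction(2 * k - 2 * n - 1, L - 2 * n - 1)
    return out


def pairing_product_m(k, L):
    """Product over m = 0..k-1 of (2m + 1) / (L - 2m - 1), in exact arithmetic."""
    out = Fraction(1)
    for m in range(k):
        out *= Fraction(2 * m + 1, L - 2 * m - 1)
    return out
```

Here `L` plays the role of $L_N$ and must satisfy `L > 2 * k` so that every denominator is positive.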

Denote the number of nodes in the connected component by $l$. Note that when a connected
component consists of $k$ edges, then $l \le k + 1$. Therefore, the total number of ways in which we
can choose these $l$ nodes is at most $\binom{N}{l} \le \binom{N}{k+1}$, when $k \le \frac{N}{2} - 1$. Thus, the $\mathbb{P}_N$-probability that
there exists a component with in between $\gamma_N$ and $R_N$ edges is bounded by

Rn / at \ o 1 1 Rn at\ k ~ l 

E/ TV \ -■ — r 2m +1 \ - JS\ 7, -1 r . -, 

(* + J n T^nsm * E jw^f* n >» - *» - ' 

fc=7jv m=0 fc=7jv m=0 

E, 2 sfc -r-r , 2m + 1 , 

k=7jv m=0 

Next, we use that $1 + x \le e^x$ for $x \ge 0$, to obtain that the $\mathbb{P}_N$-probability that there exists a
component with in between $\gamma_N$ and $R_N$ edges is bounded by

fe=7jv ^m=0 fe=7JV 



gi.jV- 



fc=7jv 



E/ 2 , fc fee I / 2 g \ 7JV , 

( — )e^<7? _1 iV( — e~) , (3.7) 



provided that y^-e 1 - 25 < 1 — rj. The right side of (|3.7[) is bounded by rj 1 N s for the choice of 7^ 
in (I3TTD. 

We complete the proof by dealing with the case that $R_N < k \le \varepsilon L_N$. In this case, we must
have that $R_N = \frac{N}{2} - 1$, otherwise there is nothing to prove, so that $k \ge \frac{N}{2}$. Then, we bound the
total number of ways in which we can choose the $l \le k + 1$ nodes by $2^N$. Since $2^N \le 2^{2k}$, for all
$k \ge N/2$, we arrive at the fact that the probability that there exists a connected component with
in between $\frac{N}{2}$ and $\varepsilon L_N$ edges is bounded by

Y2*T\ 2m + 1 <Y2™(-^-) k <±^- (-*?-) , (3.8) 

which is exponentially small in iV when e < j^. Thus, this probability is certainly bounded above 
by N~ 5 . 

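For the bound on the $\mathbb{P}$-probability that there exists a connected component with in between
$\bar\gamma_N$ and $\varepsilon L_N$ edges, we denote by $F(k, l)$ the event that there exists a connected component with in
between $k$ and $l$ edges. Then we can bound

$$\mathbb{P}(F(\bar\gamma_N, \varepsilon L_N)) \le \mathbb{P}(\mu_N \le \underline{\mu}) + \mathbb{E}\big[ I[\mu_N > \underline{\mu}]\, \mathbb{P}_N(F(\gamma_N, \varepsilon L_N)) \big], \qquad (3.9)$$

where we use that $\bar\gamma_N \ge \gamma_N$ when $\mu_N > \underline{\mu}$, choosing $\underline{\mu} = (\mu + 2)/2$ for $\mu < \infty$ and $\underline{\mu} = 3$ for
$\mu = \infty$. The first term is $o(1)$ by (3.3), the second term is small by the estimate $\mathbb{P}_N(F(\gamma_N, \varepsilon L_N)) \le
N^{-\delta}$ proved above. □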

We next present a lemma that will be used in the proof of Theorem 3.1.

Lemma 3.4 Fix $\varepsilon > 0$ and $0 < \eta < \varepsilon$ sufficiently small. Then, when $L_N \ge 2N$, the $\mathbb{P}_N$-probability
that there exists a connected component with at most $\eta N$ nodes, and in between $\varepsilon L_N$ and $(1 - \varepsilon) L_N$
stubs, is exponentially small in $N$. Consequently, the same estimate is true for the $\mathbb{P}$-probability of
this event provided that $\mathbb{P}(L_N < 2N)$ is exponentially small.

Proof. Take $0 < \eta < \varepsilon$. Again denote by $k$ the number of edges in an arbitrary connected
component satisfying $\frac{\varepsilon}{2} L_N \le k \le \frac{1-\varepsilon}{2} L_N$. Then, we must have that all the $2k$ stubs are connected
to each other, i.e., they are not connected to stubs not in the $k$ edges. The $\mathbb{P}_N$-probability of this
event is bounded by

y=r 2fc - 2n - 1 ty 1 2k - 2n _ k-n " X 
l} Q L N -2n-l- UL^to ~ " V k 

We next use that the number ways of choosing at most rjN nodes, with r] < ^ is bounded from 
above by 

Therefore, the $\mathbb{P}_N$-probability that there exists a connected component with in between $\varepsilon L_N$ and
$(1 - \varepsilon) L_N$ stubs and at most $\eta N$ nodes is bounded by

$$N L_N \exp\{c_\eta N (1 + o(1))\}\, \exp\{-c_\varepsilon \tfrac{L_N}{2} (1 + o(1))\}, \qquad (3.11)$$



where we have bounded the sum by the number of terms times the largest summand, and where we have
used that for $\eta$ small, $\binom{N}{\eta N} = e^{c_\eta N (1 + o(1))}$, where $c_\eta \downarrow 0$ as $\eta \downarrow 0$. Therefore, using that $L_N \ge 2N$,
it suffices to take $\eta > 0$ so small that $c_\eta N < (c_\varepsilon - \delta) \frac{L_N}{2}$, for some $\delta > 0$ sufficiently small, to see
that this probability is exponentially small in $N$.

For the statement involving the unconditional probability we denote by $G_{\varepsilon,\eta}$ the event that
there exists a connected component with at most $\eta N$ nodes, and in between $\varepsilon L_N$ and $(1 - \varepsilon) L_N$
stubs. Then

$$\mathbb{P}(G_{\varepsilon,\eta}) = \mathbb{E}[\mathbb{P}_N(G_{\varepsilon,\eta})] \le \mathbb{P}(L_N < 2N) + \mathbb{E}[\mathbb{P}_N(G_{\varepsilon,\eta}, L_N \ge 2N)] \qquad (3.12)$$

and both terms are exponentially small. This completes the proof of Lemma 3.4. □

We are now ready for the proof of Theorem 3.1.

Proof of Theorem 3.1. Take $\varepsilon, \delta > 0$ and fix $\gamma_N$ as in (3.1). Recall that $C_i$ denotes the connected
component that $i$ belongs to. We define the random variable $X_N$ by

$$X_N = \sum_{i=1}^{N} I[|C_i| > \gamma_N], \qquad (3.13)$$

so that $X_N$ equals the total number of nodes in connected components of size at least $\gamma_N$. By (3.4),

$$\mathbb{E}_N[X_N] = \sum_{i=1}^{N} \mathbb{P}_N(|C_i| > \gamma_N) = N q_N, \qquad (3.14)$$

where $\mathbb{E}_N$ is the expected value under $\mathbb{P}_N$. We first prove that the variance of $X_N$ under the law
$\mathbb{P}_N$ is small, so that $X_N$ is with high probability close to $N q_N$:

Lemma 3.5 With probability 1, 

T 2 iV 2 

Var^A^) = Nq N (l - q N ) + 0(^—), (3.15) 
where $\mathrm{Var}_N$ denotes the variance under $\mathbb{P}_N$.

Proof. Without explicit mentioning, all statements in the proof hold with probability 1. We first
note that $\mathrm{Var}_N(X_N) = \mathrm{Var}_N(N - X_N) = \mathrm{Var}_N(Y_N)$, where

$$Y_N = \sum_{i=1}^{N} I[|C_i| \le \gamma_N]. \qquad (3.16)$$

Therefore,

$$\mathrm{Var}_N(Y_N) = \sum_{i,j} \mathbb{P}_N(|C_i| \le \gamma_N, |C_j| \le \gamma_N) - [N(1 - q_N)]^2 \qquad (3.17)$$

$$= \sum_{i \ne j} \mathbb{P}_N(|C_i| \le \gamma_N, |C_j| \le \gamma_N) + N(1 - q_N) - N^2 (1 - q_N)^2.$$

For the first term we use the coupling in [191 Proof of Lemma A. 2. 2], with N^~ v replaced by 7^, 
to obtain that 

< lN ) = F N (T Z?M < j N ) + O(i), (3.18) 
1 W 



where $\{Z_l^{(i,N)}\}_{l \ge 1}$ is a branching process with offspring distribution

$$g_n^{(N)} = \frac{n+1}{L_N} \sum_{j=1}^{N} I[D_j = n + 1], \quad n \ge 0, \qquad (3.19)$$

and with $Z_1^{(i,N)} = D_i$, the degree of node $i$. The coupling is described in full detail in [19, Section
3], whereas the bound in (3.18) follows from the proof of [19, Lemma A.2.2], which holds under
the $\mathbb{P}_N$-probability and is therefore true for any degree sequence, and hence in particular for each
$\tau > 1$.
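
The offspring distribution in (3.19) is the size-biased empirical degree distribution, shifted down by one: a node of degree $n+1$ is reached through one of its $n+1$ stubs and then offers $n$ further stubs. As a minimal sketch in Python (the function name `size_biased_offspring` is a hypothetical helper introduced only for this illustration), it can be computed exactly from a degree sequence as follows.

```python
from collections import Counter
from fractions import Fraction


def size_biased_offspring(degrees):
    """Return {n: g_n} with g_n = (n + 1) / L_N * #{j : D_j = n + 1}, as exact fractions."""
    total = sum(degrees)        # L_N, the total number of stubs
    counts = Counter(degrees)   # number of nodes with each degree
    # A node of degree d = n + 1 contributes weight d / L_N to g_{d-1}.
    return {d - 1: Fraction(d * c, total) for d, c in counts.items() if d >= 1}
```

For the degree sequence `[1, 1, 2, 3, 3]` this gives $g_0 = 1/5$, $g_1 = 1/5$ and $g_2 = 3/5$.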

Moreover, for $i \ne j$, it is described in [19, Section 3] that we can couple $|C_i|$ and $|C_j|$ simultaneously
to two independent branching processes to obtain



2 

(\d\ < lN ,\Cj\ < lN )=F N (£zf> N) < lN )F N (J2^' N) < In) +0(y^)- (3.20) 



/ / 

Therefore, 



N 



£ P*(|Ci| < 7*, \Cj\ < 7.) = E ME z?' N) < -y»)v»(E Zi ,N) < 7iv) + O(^) 



i^j ij^j I I 

■ N x2 

J>rO, (M0 < 7*)) -(l-q N ) 2 N + 0( 



i=l I 

>,2 AT2 



L, 



<■ 2 ^ N 2 

-(1-q^N + oC' 
(N 2 -N)(l- q N f + 0@*—), (3.21) 



using ([331) and (H3H]). So, substituting (13311) into (pUTjl . 



^2 jy-2 

Var iV (X iV ) = Yav N (Y N ) = Nq N (l - q N ) + 0(^—). (3.22) 

□ 

We continue with the proof of Theorem 3.1, which is a consequence of the following proposition.
This proposition will also be used to prove Theorem 1.2 below. In its statement, we let $\mathbb{Q}$ be a
probability distribution, which we will take to be $\mathbb{P}_N$ in the proof of Theorem 3.1 and $\mathbb{P}$ in the proof
of Theorem 1.2. Let $\gamma^*_N = \gamma_N$ when $\mathbb{Q} = \mathbb{P}_N$ and $\gamma^*_N = \bar\gamma_N$ when $\mathbb{Q} = \mathbb{P}$, see (3.1) and (3.2) for the
definitions of $\gamma_N$ and $\bar\gamma_N$. Furthermore, we take $X_N = \sum_{i=1}^{N} I[|C_i| > \gamma^*_N]$ and define

$$q^*_N = \frac{1}{N} \sum_{i=1}^{N} \mathbb{Q}(|C_i| > \gamma^*_N). \qquad (3.23)$$

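Proposition 3.6 Let $\mathbb{Q} = \mathbb{P}$ or $\mathbb{Q} = \mathbb{P}_N$. Suppose that (i) $L_N \ge 2N$, (ii) $\mathrm{Var}_{\mathbb{Q}}(X_N) \le B_N = o(N^2)$,
and

$$\text{(iii)} \quad \mathbb{E}_{\mathbb{Q}}[X_N] = N q^*_N, \qquad \sum_{i,j} \mathbb{Q}(i, j \text{ connected}) = (N q^*_N)^2 (1 + o(1)), \qquad (3.24)$$

where $q^*_N \ge \varepsilon$ for some $\varepsilon > 0$, as $N \to \infty$. Then,

(i) whp the second largest component has at most $\gamma^*_N$ nodes;

(ii) whp the largest connected component has in between $N q^*_N \pm \omega_N \sqrt{B_N}$ nodes for any $\omega_N \to \infty$,
such that $\omega_N \sqrt{B_N} = o(N)$.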

To prove Theorem I3.1[ we use the above with Q = P N and B N = C . 
Proof. We define the event

$$E_N = \{ |X_N - N q^*_N| \le \omega_N \sqrt{B_N} \}. \qquad (3.25)$$

Then, by the Chebyshev inequality,

$$\mathbb{Q}(E_N^c) \le (\omega_N \sqrt{B_N})^{-2}\, \mathrm{Var}_{\mathbb{Q}}(X_N) \le \omega_N^{-2} = o(1). \qquad (3.26)$$

We write $C_{(1)}, C_{(2)}, \ldots$ for the connected components ordered according to their sizes, so that
$|C_{(1)}| \ge |C_{(2)}| \ge \ldots$ and $C_{(i)}$ and $C_{(j)}$ are disjoint for $i \ne j$. Then we clearly have that

$$\sum_{i,j} \mathbb{Q}(i, j \text{ connected}) = \sum_{i,j} \mathbb{Q}\Big(\bigcup_l \{i, j \in C_{(l)}\}\Big) = \sum_{i,j} \sum_l \mathbb{Q}(i, j \in C_{(l)}) = \sum_l \mathbb{E}_{\mathbb{Q}}[|C_{(l)}|^2]. \qquad (3.27)$$

Combining with (3.24) we get,

$$\sum_l \mathbb{E}_{\mathbb{Q}}[|C_{(l)}|^2] = (N q^*_N)^2 (1 + o(1)). \qquad (3.28)$$

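Furthermore,

$$\sum_l \mathbb{E}_{\mathbb{Q}}\big[|C_{(l)}|^2 I[|C_{(l)}| \le \gamma^*_N]\big] \le \gamma^*_N \sum_l \mathbb{E}_{\mathbb{Q}}\big[|C_{(l)}|\, I[|C_{(l)}| \le \gamma^*_N]\big] \le \gamma^*_N N. \qquad (3.29)$$

Therefore, since $\gamma^*_N = O(\log N) = o(N)$ and $q^*_N \ge \varepsilon$, we obtain that

$$\sum_l \mathbb{E}_{\mathbb{Q}}\big[|C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N]\big] = (N q^*_N)^2 (1 + o(1)). \qquad (3.30)$$

By (3.26), we thus also have that

$$\mathbb{E}_{\mathbb{Q}}\Big[\sum_l |C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N]\, I[E_N]\Big] = (N q^*_N)^2 (1 + o(1)). \qquad (3.31)$$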

We will now prove that

$$\mathbb{Q}(|C_{(2)}| > \gamma^*_N) = o(1). \qquad (3.32)$$

This proceeds in two key steps. We first show that for some $\eta > 0$ sufficiently small

$$\mathbb{Q}(|C_{(2)}| > \gamma^*_N) = \mathbb{Q}(|C_{(2)}| > \eta N) + o(1), \qquad (3.33)$$

and then that the assumption that

$$\limsup_{N \to \infty} \mathbb{Q}(|C_{(2)}| > \eta N) = \vartheta > 0, \qquad (3.34)$$

leads to a contradiction. Together, this proves (3.32). We start by proving (3.33). We note that
we only need to prove that $\mathbb{Q}(|C_{(2)}| > \gamma^*_N)$ is less than or equal to the right side of (3.33), since the
other bound is trivial (even with $o(1)$ replaced by 0).
To prove (3.33), we split for $i = 1, 2$,

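$$\mathbb{Q}(|C_{(i)}| > \gamma^*_N) = \mathbb{Q}(|C_{(i)}| > \gamma^*_N, |C_{(i)}|_b < \varepsilon L_N) + \mathbb{Q}(|C_{(i)}| > \gamma^*_N, |C_{(i)}|_b \ge \varepsilon L_N), \qquad (3.35)$$

where $|C|_b$ denotes the number of edges in $C$. Since $|C|_b \ge |C| - 1$, for any connected component $C$,
by Proposition 3.3, for any $\delta > 0$, and for $i = 1, 2$, since $\gamma^*_N = \gamma_N$ or $\gamma^*_N = \bar\gamma_N$, where whp $\bar\gamma_N \ge \gamma_N$,
we obtain

$$\mathbb{Q}(|C_{(i)}| > \gamma^*_N, |C_{(i)}|_b < \varepsilon L_N) \le \mathbb{Q}(\gamma^*_N \le |C_{(i)}|_b < \varepsilon L_N) = o(1), \qquad (3.36)$$

so that

$$\mathbb{Q}(|C_{(2)}| > \gamma^*_N) = \mathbb{Q}(|C_{(2)}| > \gamma^*_N, |C_{(1)}|_b \ge \varepsilon L_N, |C_{(2)}|_b \ge \varepsilon L_N) + o(1). \qquad (3.37)$$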

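By Lemma 3.4 and because $\{|C_{(1)}|_b \ge \varepsilon L_N\} \subseteq \{|C_{(2)}|_b \le (1 - \varepsilon) L_N\}$, we further have that for $\eta > 0$
sufficiently small

$$\mathbb{Q}(|C_{(2)}| \le \eta N, |C_{(1)}|_b \ge \varepsilon L_N, |C_{(2)}|_b \ge \varepsilon L_N) = o(1). \qquad (3.38)$$

Therefore, using (3.37),

$$\mathbb{Q}(|C_{(2)}| > \gamma^*_N) \le \mathbb{Q}(|C_{(2)}| > \eta N, |C_{(1)}|_b \ge \varepsilon L_N, |C_{(2)}|_b \ge \varepsilon L_N) + o(1) \le \mathbb{Q}(|C_{(2)}| > \eta N) + o(1). \qquad (3.39)$$

This proves (3.33).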

We next prove that (3.34) is in contradiction with (3.31). Observe that

$$X_N = \sum_{i=1}^{N} I[|C_i| > \gamma^*_N] = \sum_l |C_{(l)}|\, I[|C_{(l)}| > \gamma^*_N], \qquad (3.40)$$

so that, on the event $E_N$, we have that $\sum_l |C_{(l)}|\, I[|C_{(l)}| > \gamma^*_N] = N q^*_N (1 + o(1))$. Using (3.40) we can
bound

$$\sum_l |C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N] \le |C_{(2)}|^2 + \Big(\sum_{l \ne 2} |C_{(l)}|\, I[|C_{(l)}| > \gamma^*_N]\Big)^2 = |C_{(2)}|^2 + (X_N - |C_{(2)}|)^2. \qquad (3.41)$$

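We split the expectation in (3.31) by intersecting with the event $\{|C_{(2)}| > \eta N\}$ and its complement:

$$\mathbb{E}_{\mathbb{Q}}\Big[\sum_l |C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N]\, I[E_N]\Big] = \mathbb{E}_{\mathbb{Q}}\Big[\sum_l |C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N]\, I[E_N \cap \{|C_{(2)}| > \eta N\}]\Big] \qquad (3.42)$$

$$+\ \mathbb{E}_{\mathbb{Q}}\Big[\sum_l |C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N]\, I[E_N \cap \{|C_{(2)}| \le \eta N\}]\Big].$$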

We next use a simple calculus argument. For $\eta N \le x \le y/2$, the function $x \mapsto x^2 + (y - x)^2$ is
maximal for $x = \eta N$. We apply the arising inequality to the right side of (3.41), with $x = |C_{(2)}|$ and
$y = X_N \ge |C_{(1)}| + |C_{(2)}| \ge 2|C_{(2)}| = 2x$, so that,

$$\mathbb{E}_{\mathbb{Q}}\Big[\sum_l |C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N]\, I[E_N \cap \{|C_{(2)}| > \eta N\}]\Big]$$

$$\le \mathbb{E}_{\mathbb{Q}}\big[(|C_{(2)}|^2 + (X_N - |C_{(2)}|)^2)\, I[E_N \cap \{|C_{(2)}| > \eta N\}]\big]$$

$$\le \mathbb{E}_{\mathbb{Q}}\big[(\eta^2 N^2 + (X_N - \eta N)^2)\, I[E_N \cap \{|C_{(2)}| > \eta N\}]\big]$$

$$\le (\eta^2 + (q^*_N - \eta)^2)\, N^2\, \mathbb{Q}(|C_{(2)}| > \eta N)(1 + o(1)), \qquad (3.43)$$

where we used in the last step that on $E_N$ we have $X_N = q^*_N N (1 + o(1))$, because $\omega_N \sqrt{B_N} = o(N)$.
On the other hand, we have, on the event $E_N$, using again that $\omega_N \sqrt{B_N} = o(N)$,

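$$\sum_l |C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N] \le \Big(\sum_l |C_{(l)}|\, I[|C_{(l)}| > \gamma^*_N]\Big)^2 = X_N^2 = (N q^*_N)^2 (1 + o(1)), \qquad (3.44)$$

implying that

$$\mathbb{E}_{\mathbb{Q}}\Big[\sum_l |C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N]\, I[E_N \cap \{|C_{(2)}| \le \eta N\}]\Big] \le \mathbb{Q}(\gamma^*_N < |C_{(2)}| \le \eta N)(N q^*_N)^2 (1 + o(1)). \qquad (3.45)$$

Together, (3.42), (3.43) and (3.45) yield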

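$$\mathbb{E}_{\mathbb{Q}}\Big[\sum_l |C_{(l)}|^2 I[|C_{(l)}| > \gamma^*_N]\, I[E_N]\Big] \qquad (3.46)$$

$$\le \big[(\eta^2 + (q^*_N - \eta)^2) N^2\, \mathbb{Q}(|C_{(2)}| > \eta N) + (q^*_N)^2 N^2\, \mathbb{Q}(\gamma^*_N < |C_{(2)}| \le \eta N)\big] (1 + o(1)),$$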



so that the assumption that limsup^^^ Q(|C (2) | > rjN) = 9 > is in contradiction with (|3.3ip . 
because assuming both (|3.3ip and (|3.46p would imply that rj > limsupq^ = e, since for < rj < 
we have rj 2 + (q% — rj) 2 < (q%) 2 . This proves that the assumption in (|3.34p is false, and we conclude 
that (|3.32p holds, which proves the claim for the second largest component. 

We now prove that whp the largest component has size in between $N q^*_N \pm \omega_N \sqrt{B_N}$ for any
$\omega_N \to \infty$. For this, we note that on the event that the second largest component has size less than
or equal to $\gamma^*_N$, we have (compare (3.40)),

$$X_N = \sum_l |C_{(l)}|\, I[|C_{(l)}| > \gamma^*_N] = |C_{(1)}|\, I[|C_{(1)}| > \gamma^*_N] = |C_{(1)}|. \qquad (3.47)$$

By (3.26) and (3.32), which is now established, we thus obtain that

$$\mathbb{Q}\big(\big||C_{(1)}| - N q^*_N\big| > \omega_N \sqrt{B_N}\big) \le \mathbb{Q}\big(|X_N - N q^*_N| > \omega_N \sqrt{B_N}\big) + \mathbb{Q}(|C_{(2)}| > \gamma^*_N)$$

$$= \mathbb{Q}(E_N^c) + \mathbb{Q}(|C_{(2)}| > \gamma^*_N) = o(1). \qquad (3.48)$$

This completes the proof of Proposition 3.6. □
The proof of Theorem 13. II follows from Proposition 13.61 by taking Q = ¥ N , and B N = C = 
$o(N^2)$. For this we note that $\mathbb{E}_N[X_N] = N q_N$ follows from (3.4). The second assumption in (3.24)
follows from (3.5) and Lemma 3.5 as follows:

^2 3 connected) = ^ j € C (0 ) = ^E^[|C (i) | 2 I[C (!) > 7jv ]] + o(N 2 ), (3.49) 

i,j i,j I I 

because 7^ = o(N). In turn: 

J2 \C m \ 2 I[C m > 7iv] = > 7iv, \C U) \ > 7w ], (3.50) 

so that 

^E W [|C (0 | 2 J[C (0 > j N ]]=M N [X 2 N ] = (E N [X N ]) 2 +V a r N (X N ) = N 2 q 2 N (l + o(l)), (3.51) 
l 

because Lemma 3.5 stated that $\mathrm{Var}_N(X_N)$ is of order $N$. □

3.2 Proof of Theorem 1.2

The proof of Theorem 1.2 will be given by verifying the conditions of Proposition 3.6 with $\mathbb{Q} = \mathbb{P}$
and $\gamma^*_N = \bar\gamma_N$ defined in (3.2). In order to do so we will use results proved in [19] for $\tau > 3$, [20] for
$\tau \in (2, 3)$ and [17] for $\tau \in (1, 2)$ (when we apply these results we will give more specific references).

We now turn to the proof of the theorem in question. Because in the configuration model the
nodes $1, 2, \ldots, N$ are exchangeable,

$$\mathbb{E}[X_N] = N\, \mathbb{P}(|C_1| > \bar\gamma_N), \qquad (3.52)$$

and this identifies $q_N = \mathbb{P}(|C_1| > \bar\gamma_N)$ (see (3.4)). We next note that (again using that the nodes
$1, 2, \ldots, N$ are exchangeable),

$$\sum_{i,j} \mathbb{P}(i, j \text{ connected}) = N(N - 1)\, \mathbb{P}(1, 2 \text{ connected}) + N. \qquad (3.53)$$

In [19, p. 99, Equation (4.22)] (case $\tau > 3$), [20, (4.96)] (for $\tau \in (2,3)$), it was shown that

$$\mathbb{P}(1, 2 \text{ connected}) = q^2 (1 + o(1)), \qquad (3.54)$$

where $q$ is the survival probability of the delayed branching process $\{Z_l\}_{l \ge 1}$. For $\tau \in (1,2)$ we
showed in [17, Theorem 1.1] that the graph-distance between 1 and 2 is whp either equal
to 2 or to 3, so in this case (3.54) holds with $q = 1$.

Comparing the conditions of Proposition 3.6 and those of Theorem 1.2 shows that in order to
use Proposition 3.6 it remains to show that: (i) $q_N = q + o(1)$, and (ii) to give a bound $B_N = o(N^2)$
on $\mathrm{Var}(X_N)$. This is indeed so, because $L_N \ge 2N$ follows from $\mu > 2$, when $\tau > 2$, or is immediate
from $\tau \in (1,2)$. We prove (i) in Lemma 3.7 and (ii) in Lemma 3.8 below.

Lemma 3.7 $q_N = q + o(1)$.

Proof. We have that

$$q_N = \mathbb{P}(|C_1| > \bar\gamma_N) = 1 - \mathbb{P}(|C_1| \le \bar\gamma_N). \qquad (3.55)$$

Using (3.18), we obtain

<7^) =E[P J v( <7v)] +oA. (3.56) 
i ^ 

The coupling is described in full detail in [19, Section 3, p. 87], whereas the bound in (3.18)
and hence (3.56) follows from the proof of [19, Lemma A.2.2, p. 111], which holds under the
$\mathbb{P}_N$-probability and is therefore true for any degree sequence. Therefore,

q N = 1 - E[F N { V Z^ N) < j N )] + oA. (3.57) 
We start with $\tau \in (1,2)$. We note that with probability 1,
ME^° < 7*) < P^C^ < 7.) < E<T = E Y~ l \ D i < 7* + 1] < ( ^\ +1)Ar - (3.58) 



( n=l i=l 

Therefore, since by dominated convergence both E[^-] — ► and E[^^tlMj _> we conclude that 
g w = 1 — o(l), when r G (1, 2). 

We next turn to $\tau \in (2, 3)$ and $\tau > 3$, which we treat simultaneously. For this, we use that we
can prove by coupling (see [19, Section 3, p. 87]) that

$$\mathbb{P}_N\Big(\sum_l Z_l^{(1,N)} \le \bar\gamma_N\Big) = \mathbb{P}_N\Big(\sum_l Z_l \le \bar\gamma_N\Big) + O(\bar\gamma_N p_N) = \mathbb{P}\Big(\sum_l Z_l \le \bar\gamma_N\Big) + O(\bar\gamma_N p_N), \qquad (3.59)$$

where $p_N$ is the total variation distance between $\{g_n^{(N)}\}$ and $\{g_n\}$ given by

$$p_N = \frac{1}{2} \sum_{n=0}^{\infty} |g_n^{(N)} - g_n|, \qquad (3.60)$$

and where the second equality in (3.59) follows since the offspring distribution of $\{Z_l\}_{l \ge 0}$ does not
depend on the degrees $D_1, D_2, \ldots, D_N$.

In [19, Proposition 3.4, p. 92], it is shown that for $\tau > 3$, and some $\alpha_2, \beta_2 > 0$,

$$\mathbb{P}(p_N > N^{-\alpha_2}) \le N^{-\beta_2}. \qquad (3.61)$$

In [19, Remark A.1.3, p. 107], the same conclusion is derived for $\tau \in (2,3)$. Therefore,

q N = 1 - P( VZ, < 7 iv) + 0(^-) + 0(A"*) + 0( 7jv jV- a2 ), (3.62) 

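so that, in turn,

$$q_N = q + \mathbb{P}\Big(\bar\gamma_N < \sum_l Z_l < \infty\Big) + o(1). \qquad (3.63)$$

We have that

$$\mathbb{P}\Big(\bar\gamma_N < \sum_l Z_l < \infty\Big) = (1 - q)\, \mathbb{P}\Big(\sum_l Z_l > \bar\gamma_N \,\Big|\, \text{extinction}\Big). \qquad (3.64)$$
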
A supercritical branching process conditioned on extinction is a branching process with law

$$g^*_n = (1 - q)^{n-1} g_n \ (n \ge 1), \qquad g^*_0 = 1 - \sum_{n \ge 1} g^*_n. \qquad (3.65)$$

Indeed, if $F_n$ is the event that $\{Z_l\}$ has $n$ children in the first generation, then

$$\mathbb{P}(\{Z_l\} \text{ dies out}, F_n) = g_n\, \mathbb{P}(n \text{ copies of } \{Z_l\} \text{ die out}) = (1 - q)^n g_n. \qquad (3.66)$$

It is not hard to see that $g^*$ is a subcritical offspring distribution, and it clearly has finite mean.
Therefore, in particular, the total progeny has finite mean (in fact, even exponential tails), so that
by the Markov inequality

$$\mathbb{P}\Big(\sum_l Z_l > \bar\gamma_N \,\Big|\, \text{extinction}\Big) \le \bar\gamma_N^{-1}\, \mathbb{E}\Big[\sum_l Z_l \,\Big|\, \text{extinction}\Big] = O(\bar\gamma_N^{-1}) = o(1). \qquad (3.67)$$

This completes the proof of Lemma 3.7. □
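
The duality in (3.65) can be checked numerically for a simple offspring law. The sketch below is an illustration under simplifying assumptions: it treats an ordinary (non-delayed) Galton-Watson process with a finite offspring distribution given as a dictionary, computes the extinction probability $1 - q$ as the smallest fixed point of the generating function by iteration from 0, and then builds the conditioned law $g^*$. The function names are hypothetical.

```python
def survival_probability(g, iterations=10_000, tol=1e-13):
    """g maps n -> P(n children). Returns q = 1 - (smallest fixed point of the pgf)."""
    s = 0.0
    for _ in range(iterations):
        # Evaluate the probability generating function at the current iterate.
        new = sum(p * s ** n for n, p in g.items())
        if abs(new - s) < tol:
            s = new
            break
        s = new
    return 1.0 - s


def conditioned_on_extinction(g):
    """Return (q, g_star) with g_star[n] = (1 - q)**(n - 1) * g[n] for n >= 1."""
    q = survival_probability(g)
    g_star = {n: (1.0 - q) ** (n - 1) * p for n, p in g.items() if n >= 1}
    g_star[0] = 1.0 - sum(g_star.values())
    return q, g_star
```

For the offspring law with $g_0 = 1/4$ and $g_2 = 3/4$, the extinction probability solves $s = 1/4 + 3s^2/4$, so $1 - q = 1/3$, and the conditioned law has $g^*_2 = 1/4$, $g^*_0 = 3/4$, with mean $1/2 < 1$, in line with the subcriticality claimed above.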
We must also show (ii), i.e., we have to show that the variance of $X_N$ is bounded by $B_N = o(N^2)$.
We will show:

Lemma 3.8 There exists $\beta > 0$ such that

$$\mathrm{Var}(X_N) = O(N^{2-\beta}). \qquad (3.68)$$

Proof. We follow the proof of Lemma 3.5. We rewrite

$$\mathrm{Var}(X_N) = \mathrm{Var}(Y_N) = \mathbb{E}(\mathrm{Var}_N(Y_N)) + \mathbb{E}(\mathbb{E}_N[Y_N]^2) - \mathbb{E}[Y_N]^2. \qquad (3.69)$$

By Lemma 3.5, $\mathbb{E}(\mathrm{Var}_N(Y_N))$ is certainly bounded by $O(N^{2-\beta})$, and we are left to bound the second
term. We start with $\tau \in (1, 2)$. We use (3.58) to see that whp, and for some $\eta > 0$,

P JV ( V Zf' N) < 7jv ) < fa ± l)N < AT-". (3.70) 



i 



Therefore, using that Y N = Yli=i — 7iv 



E(E N [Y N } 2 ) < E(N 2 F 2 N (Y, Z^ N) < j N )) + 0(N 2 ^r) = 0{N 2 ~ r >). (3.71) 



Therefore, (3.68) holds with $\beta = \eta$.

We next turn to $\tau \in (2, 3)$ and $\tau > 3$, which we treat simultaneously. Using once more the fact
that $Y_N = \sum_{i=1}^{N} I[|C_i| \le \bar\gamma_N]$ and (3.18), we get

N N /V -9 

Eat^at] = VPivdCil < 7a.) = VP^(Vzf w) < 7jv ) +0(— ^). (3.72) 

1=1 1=1 I 

Hence 

E(E N [Y N ] 2 ) - E[Y N ] 2 = WePM^Z^ < 7iv)Piv(^^ (3 ' JV) < 7*)] 

hi I i i 

- P(V^' JV) < 7 JV )P(VZ/- Ar) < j N ) 1 +0(^). (3.73) 
l i J ^ 

Now, by (3.59), we can replace $\mathbb{P}_N(\sum_l Z_l^{(i,N)} \le \bar\gamma_N)$ by $\mathbb{P}(\sum_l Z_l^{(i)} \le \bar\gamma_N)$, at the cost of an additional
error term $O(\bar\gamma_N p_N)$, where $\{Z_l^{(i)}\}_{l \ge 1}$ for $i = 1, 2, \ldots, N$ are independent copies of the branching
process $\{Z_l\}$. Since $\{Z_l^{(i)}\}_l$ and $\{Z_l^{(j)}\}_l$ are independent for $i \ne j$ and their law is independent of the
degree sequence, we have that

< 7iv)^(E^' ) ^ ^) = p2 (E^ ^ ^ ( 3 - 74 ) 

so that we obtain



E(E n [Y n ] 2 )-E[Y n } 2 = 0(N 2 ^ N E[p N ])+0(^p^) = 0(N 2 ^), (3.75) 

by bounding the sum over $i$ by $N$, and using (3.61), which implies $\mathbb{E}[p_N] \le N^{-(\alpha_2 \wedge \beta_2)}$, so that
choosing $0 < \beta < \alpha_2 \wedge \beta_2$ kills the additional factor $\log N$ originating from $\bar\gamma_N$. □
This concludes the proof of Theorem 11.21 We even obtain an improvement, since \/B N < N l ~@ 
for any < (3, so that whp the largest cluster is in between Nq N ± ujj^N 1 " 13 . □ 



4 Further bounds on connected components and diameter 
4.1 On connected components 

The proof of Theorem 1.3 is based on the following lemma. Recall that $f_k = \mathbb{P}(D = k)$, $k \ge 1$.

Lemma 4.1 Assume the conditions of Theorem 1.3. Suppose further that for $k = k_N =
O(\log N)$, and some $0 < \delta < 1/6$, whp

$$N f_k \left( \frac{f_1 (1 - \delta)}{\mu_N} \right)^k \to \infty. \qquad (4.1)$$

Then whp the random graph contains a connected component with $k + 1$ nodes.



Proof. Take $k$ such that $f_k > 0$ and consider the star-like connected component, with one node of
degree $k$ at the center and $k$ nodes of degree 1 at the ends (see Figure 1).




Figure 1: A star-like connected component with k + 1 nodes. 

We will show that if the condition of the lemma holds, then, the random graph contains the 
above connected component whp. 

The main idea behind the proof is the following. There are whp at least $f_1 (1 - \delta) N$ nodes of
degree one. Hence, the probability that we connect a node of degree $k$ to $k$ nodes of degree 1 is at
least

$$\left( \frac{f_1 (1 - \delta) N}{L_N} \right)^k = \left( \frac{f_1 (1 - \delta)}{\mu_N} \right)^k. \qquad (4.2)$$

Since, whp, there are about $N f_k$ nodes of degree $k$, we have about $N f_k$ trials to make such a $k$-star
component. The mean number of successful trials is then about

$$N f_k \left( \frac{f_1 (1 - \delta)}{\mu_N} \right)^k \to \infty,$$

by condition (4.1). Hence, whp we expect to make at least one successful trial, and find a $k$-star
component.
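
The heuristic count above is easy to evaluate for concrete parameters. The following one-function Python sketch simply computes the expression in condition (4.1); the function name and the numerical values in the usage note are illustrative assumptions and are not taken from the paper.

```python
def expected_star_trials(N, k, f1, fk, mu_N, delta):
    """Heuristic mean number of successful k-star trials: N * f_k * (f_1 * (1 - delta) / mu_N) ** k."""
    return N * fk * (f1 * (1.0 - delta) / mu_N) ** k
```

For example, with $N = 10^6$, $k = 2$, $f_1 = 0.5$, $f_2 = 0.1$, $\mu_N = 2.5$ and $\delta = 0.1$, the per-trial success bound is $0.18^2 = 0.0324$ and the heuristic mean is $10^5 \cdot 0.0324 = 3240$. Because the bound decays geometrically in $k$ while $N$ enters only linearly, the condition can only hold for $k$ of order at most $\log N$ when $f_1(1-\delta)/\mu_N < 1$.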

We will now give the details of the proof. First we define a procedure which determines the
existence of $k$-stars in the random graph. Consider the process of pairing stubs in the graph. We
are free to choose the order in which we pair them. Consider $D_{j_1}, \ldots, D_{j_{\ell_N}}$, where we abbreviate
$\ell_N = N(k)$ to be equal to the number of the nodes with degree $k$, which we call $k$-nodes for brevity.
We pair the stubs in the following order:

Let $S_1 = (S_{1,1}, \ldots, S_{1,k})$ be the stubs of node $j^*_1 = j_1$. We first pair $S_{1,1}$. If it is paired with
a stub of a node of degree 1, then we call this pairing successful and consider the pairing of $S_{1,2}$.
If $S_{1,2}$ is paired with a stub of a node of degree 1, then we call the second pairing successful and
consider the pairing of $S_{1,3}$, and so on until the first moment when one of the two following things
happens. The first case is that all stubs of node $j^*_1$ are paired with nodes of degree 1. Then we
observe a $k$-star component, we call the first trial successful and stop. The second case is that we
come to $l \le k$ such that the $l$-th pairing is unsuccessful, i.e., $S_{1,l}$ is not paired with a node of degree
1. Then we call the whole trial unsuccessful and stop with pairing of the stubs in $S_1$. In the latter
case it is possible that $S_{1,l}$ is paired with another node in $j_2, \ldots, j_{\ell_N}$. Such a node cannot turn into
a $k$-star anymore; we call this node, as well as node $j^*_1$, used and discard them in the procedure.

We define our successive trials inductively. For any $m \ge 2$, let $j^*_m$ be the first unused node in
the sequence $j_1, \ldots, j_{\ell_N}$. Then, for $j^*_m$ we use the same procedure as with $j^*_1$ to determine whether
the $m$-th trial is successful or not. If the trial is not successful, then node $j^*_m$ becomes used, and if
the corresponding unsuccessful pairing involves another unused $k$-node, then we also call this node
used. We always discard all used nodes from the procedure.

We repeat these trials until we find a successful $k$-node or until all $k$-nodes are used. The
$\mathbb{P}_N$-probability that the $j$-th trial is successful is
II gi^r w a « - w > *i, («) 

s=0 J 

where $N(1) - L_N(1,j)$ is the remaining number of free stubs of the degree 1 nodes up to the moment
of the $j$-th trial. Let $r_N(k)$ be the number of trials. Since at every unsuccessful trial the number
of used nodes of degree $k$ increases by at most two, we have $r_N(k) \ge \lfloor N(k)/2 \rfloor$. Instead of using
all these trials, we will only use $\lfloor \delta^2 N(k) \rfloor$ of them. Then, after $\lfloor \delta^2 N(k) \rfloor$ trials, there are at least
$N(1) - k \lfloor \delta^2 N(k) \rfloor$ remaining free stubs attached to nodes of degree 1. Hence, for $j \le \lfloor \delta^2 N(k) \rfloor$,
(4.3) is at least

N(1) ~f kN{k) ) k I[N(k) > j]I[N(l) - 5 2 kN(k) > 0], (4.4) 

where we use that if $N(1) - \delta^2 k N(k) > 0$, then, for all $j \le \lfloor \delta^2 N(k) \rfloor - 1$, we have that $N(1) -
L_N(1,j) \ge k$. Then, the $\mathbb{P}_N$-probability that all $\lfloor \delta^2 N(k) \rfloor$ trials are unsuccessful is at most

I[N(1) -5 2 kN(k) > 0]. (4.5) 




If we show that whp

$$\text{(i)}\ N(1) - \delta^2 k N(k) \ge (1 - \delta) N f_1, \qquad \text{(ii)}\ N(k) \ge N f_k / 2, \qquad (4.6)$$

then, again whp, the probability that all $\lfloor \delta^2 N(k) \rfloor$ trials are unsuccessful is at most




<expl-(^ A /2 + l)(WVU»(l), (4.7) 



i„ 1 ) ~ r \ ' ' 'V(WJV) 

due to (4.1), where we further used that $1 - x \le e^{-x}$ for $x \ge 0$. Hence, we are done if we prove
(4.6).

For $\tau \in (1,2)$, we have by assumption $f_k > 0$, for some $k \le \gamma_2^{**}$ (see (1.8)). It follows that $k$ is
bounded and so by the law of large numbers we have whp,

$$(1 - \delta) N f_k \le N(k) \le (1 + \delta) N f_k. \qquad (4.8)$$

Hence, part (4.6(ii)) is clear from the lower bound in (4.8) when $\delta < 1/2$. For part (4.6(i)), we use
(4.8) together with the similar bound that whp

$$(1 - \delta^2) N f_1 \le N(1) \le (1 + \delta^2) N f_1. \qquad (4.9)$$

Then

$$N(1) - \delta^2 k N(k) \ge (1 - \delta^2) N f_1 - \delta^2 (1 + \delta) N k f_k = N[f_1 - \delta^2 (f_1 + (1 + \delta) k f_k)] \ge N f_1 (1 - \delta), \qquad (4.10)$$

when $\delta$ is sufficiently small, since $k$ is fixed. This completes the proof of (4.6) when $\tau \in (1,2)$.

We turn to the case $\tau > 2$. In this case $k = k_N = O(\log N)$ and hence the law of large numbers
does not apply. Instead, we use [21], which states that a binomial random variable $X$ satisfies

$$\mathbb{P}(|X - \mathbb{E}[X]| \ge t) \le 2 \exp\left\{ -\frac{t^2}{2(\mathbb{E}[X] + t/3)} \right\}, \qquad (4.11)$$

for all $t > 0$. We apply the above result with $X = N(k)$, $\mathbb{E}[X] = N f_k$ and $t = \delta N f_k$. Then we
obtain for large enough $N$,

$$\mathbb{P}(|N(k) - N f_k| \ge \delta f_k N) \le 2 \exp\left\{ -\frac{\delta^2 N f_k}{2(1 + \delta/3)} \right\} = o(1), \qquad (4.12)$$

uniformly in $N$ and $k = k_N = O(\log N)$. This yields (4.8) and hence (4.6(ii)), whp, when $\delta < 1/2$.
Furthermore, for $\tau > 2$ we obtain, because the expectation $\sum_j j f_j < \infty$, that $k f_k = o(1)$ when
$k \to \infty$. Hence, whp for large enough $N$,

N(l) - 5 2 kN(k) > (1 - 5 2 )hN - S 2 (l + 5)kf k N/2 > (1 - 8)hN, 

uniformly in $N$ and in $k = k_N = O(\log N)$, for $\delta > 0$ small enough, so that (4.6(i)) holds. This
completes the proof of (4.6) for $\tau > 2$.

□ 

Proof of Theorem 1.3(i). We check the conditions of Lemma 4.1 for $\tau \in (1,2)$. Firstly, since
$k \le \gamma_2^{**}$, and $\gamma_2^{**}$ being constant, the condition $k = k_N = O(\log N)$ of Lemma 4.1 is trivially
fulfilled. Secondly, we rewrite the left side of (4.1) as

$$N f_k \left( \frac{f_1 (1 - \delta)}{\mu_N} \right)^k = f_k (f_1 (1 - \delta))^k e^{\log N - k \log \mu_N}. \qquad (4.13)$$

Fix $0 < \delta' < \delta < 1/6$, and let $\varepsilon > 0$ be arbitrary. Since $L_N = D_1 + \cdots + D_N$, where $D_1$ is in the
domain of attraction of a stable law ([18, Corollary 2, XVII.5, p. 578]), we have

P (logfi N < (l + ^O^^logiVj =p(/i w < At( 1+<5 ')^t) =P ^ < AT^+^i 11 ^ > 1 - e, 

(4.14) 

since $(2 - \tau)/(\tau - 1) > 0$, for $\tau \in (1, 2)$. Thus, we obtain that, with probability at least $1 - \varepsilon$,
Nf k ( fl{1 - S) ) h = f k (/!(! - ^eio^-itiogw > /fc _ ^jfc ^-(i+^ja-i) ^ 0C) (415) 

V ) 

for every $k \le \gamma_2^{**}$, where $\gamma_2^{**}$ is defined in (1.8). Therefore, the conditions of Lemma 4.1 are fulfilled,
and Theorem 1.3(i) follows. □

Proof of Theorem 1.3(ii). We again use Lemma 4.1 and check its conditions. Firstly, by the
condition of the theorem, $k$ clearly satisfies $k = k_N = O(\log N)$. Secondly, we rewrite the expression
in the left side of (4.1), using (1.9), and with $\delta$ replaced by $\delta'$, as

$$N f_k \left( \frac{f_1 (1 - \delta')}{\mu_N} \right)^k = L_f(k)\, e^{-\tau \log k}\, e^{\log N + k \log(f_1 (1 - \delta')/\mu_N)}. \qquad (4.16)$$

Since $\tau > 2$, we have by the weak law of large numbers (w.l.l.n.), $\mu_N \to \mu$ in probability, as
$N \to \infty$, so that whp $\mu_N \le \mu/(1 - \delta')$. On this event, we then obtain the lower bound

log N + k log ^Mlzilj > bg N - 7l ** log N ■ log ( ^ ^ ) (4.17) 

> log jv(l - 1 "/ [logW/i) - 21og(l - y)]) >^logiV, 
V log(/x//i) L V 2 

when $\delta' > 0$ is sufficiently small. Substituting the above lower bound in the right side of (4.16), we
obtain from (1.9) that, for sufficiently large $N$, whp

$$N f_k \left( \frac{f_1 (1 - \delta')}{\mu_N} \right)^k \ge L_f(k)\, e^{-\tau \log k}\, N^{\delta/2} \to \infty, \quad \text{as } N \to \infty, \qquad (4.18)$$

where we have used that $k = k_N = O(\log N)$ so that $e^{-\tau \log k} = e^{-O(\log \log N)}$. We conclude that the
condition (4.1) is fulfilled with some $\delta' > 0$, and we have proved Theorem 1.3(ii). □

4.2 A lower bound on the diameter 

We now prove Theorem 1.5, which gives a lower bound on the diameter.

Proof of Theorem 1.5. We start by proving the claim when $f_2 > 0$. The idea behind the
proof is simple. Under the conditions of the theorem, one can find, whp, a path $T(N)$ in the
random graph such that this path consists exclusively of nodes with degree 2 and has length at
least $2\alpha \log N$. This implies that the diameter $D(G)$ is at least $\alpha \log N$, since the above path could
be a cycle.

Below we define a procedure which proves the existence of such a path. Consider the process of 
pairing stubs in the graph. We are free to choose the order in which we pair the free stubs, since 
this order is irrelevant for the distribution of the random graph. Hence, we are allowed to start 
with pairing the stubs of the nodes of degree 2. 

Let $S_N(2) = (i_1, \ldots, i_{N(2)}) \in \mathbb{N}^{N(2)}$ be the nodes of degree 2, where we recall that $N(2)$ is
the number of such nodes. We will pair the stubs and at the same time define a permutation
$\Pi(N) = (i^*_1, \ldots, i^*_{N(2)})$ of $S_N(2)$, and a characteristic $\chi(N) = (\chi_1, \ldots, \chi_{N(2)})$ on $\Pi(N)$, where $\chi_j$ is
either 0 or 1. $\Pi(N)$ and $\chi(N)$ will be defined inductively in such a way that for any node $i^*_k \in \Pi(N)$,
$\chi_k = 1$ if and only if node $i^*_k$ is connected to node $i^*_{k+1}$. Hence, if $\chi(N)$ contains a substring of at
least $2\alpha \log N$ ones, then the random graph contains a path $T(N)$ of length at least $2\alpha \log N$.

We initialize our inductive definition by $i^*_1 = i_1$. The node $i^*_1$ has two stubs; we consider the
second one and pair it to an arbitrary free stub. If this free stub belongs to another node $j \ne i^*_1$
in $S_N(2)$, then we choose $i^*_2 = j$ and $\chi_1 = 1$, else we choose $i^*_2 = i_2$ and $\chi_1 = 0$. Suppose for some
$1 \le k < N(2)$, the sequences $(i^*_1, \ldots, i^*_k)$ and $(\chi_1, \ldots, \chi_{k-1})$ are defined. If $\chi_{k-1} = 1$, then one stub
of $i^*_k$ is paired to a stub of $i^*_{k-1}$, and another stub of $i^*_k$ is free, else, if $\chi_{k-1} = 0$, node $i^*_k$ has two
free stubs. Thus, node $i^*_k$ has at least one free stub. We pair this stub to an arbitrary remaining
free stub. If this second stub belongs to node $j \in S_N(2) \setminus \{i^*_1, \ldots, i^*_k\}$, then we choose $i^*_{k+1} = j$ and
$\chi_k = 1$, else we choose $i^*_{k+1}$ as the first node in $S_N(2) \setminus \{i^*_1, \ldots, i^*_k\}$, and $\chi_k = 0$. Hence, we have
defined $\chi_k = 1$ if and only if node $i^*_k$ is connected to node $i^*_{k+1}$.
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We show that whp there exists a substring of ones of length at least 2a log N in the first 
half of xn, i-e., in xi( N ) = (Xi*,- ■ ■ ,Xi? Ml ^ /0 ,)- For this purpose, we couple the sequence Xi(iV) 

2 1 l N ( 2 )/ 2 } 2 

with a sequence Bi(N) = where £/% are i.i.d. Bernoulli random variables taking value 1 with 

2 

probability f 2 /(A/j,), and such that Xi* k > £fc> f° r all k £ {1, . . . , |_iV(2) /2j J- , whp. Indeed, for any 
1 < k < |_iV(2)/2j, the P^-probability that Xi* k = 1 is at least 

™{2)-C N (k) ^ (4ig) 
L N — Cjv(fe) 

where, as before, $N(2)$ is the total number of nodes with degree 2, and $C_N(k)$ is the total number of
paired stubs after $k + 1$ pairings. By definition of $C_N(k)$, for any $k \le N(2)/2$, we have

$$C_N(k) = 2(k - 1) + 1 \le N(2). \tag{4.20}$$

Due to the weak law of large numbers we also have that whp

$$N(2) \ge f_2 N/2, \qquad L_N \le 2\mu N. \tag{4.21}$$

Substitution of (4.20) and (4.21) into (4.19) gives us that the right side of (4.19) is at least

$$\frac{N(2)}{L_N} \ge \frac{f_2}{4\mu}.$$
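
For completeness, the substitution step can be written out as a short chain of inequalities. This is a sketch under the reading that (4.19) is the ratio of free stubs of degree-2 nodes to all free stubs; it uses only $C_N(k) \le N(2)$ from (4.20), then the two bounds in (4.21).

```latex
\[
\frac{2N(2) - C_N(k)}{L_N - C_N(k)}
\;\ge\; \frac{2N(2) - N(2)}{L_N}
\;=\; \frac{N(2)}{L_N}
\;\ge\; \frac{f_2 N/2}{2\mu N}
\;=\; \frac{f_2}{4\mu}.
\]
```

The first inequality decreases the numerator (since $C_N(k) \le N(2)$) and increases the denominator (since $C_N(k) \ge 0$); the second holds whp by (4.21).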

Thus, whp we can stochastically dominate all coordinates of the random sequence $\chi_{\frac12}(N)$ with an
i.i.d. Bernoulli sequence $B_{\frac12}(N)$ of $N f_2/2$ independent trials with success probability $f_2/(4\mu)$. It is
well known (see [2]) that the probability of existence of a run of $2\alpha\log N$ ones converges to one
whenever

$$2\alpha\log N \le (1 - \varepsilon)\,\frac{\log(N f_2/2)}{|\log(f_2/(4\mu))|},$$

for some $0 < \varepsilon < 1$.

We conclude that whp the sequence $B_{\frac12}(N)$ contains a run (and hence a substring) of $2\alpha\log N$
ones. Since whp $\chi_{\frac12}(N) \ge B_{\frac12}(N)$, where the ordering is componentwise, whp the sequence $\chi_{\frac12}(N)$ also
contains the same substring of $2\alpha\log N$ ones, and hence there exists a required path consisting of
at least $2\alpha\log N$ nodes with degree 2. Thus, whp the diameter is at least $\alpha\log N$, and we have
proved the theorem in the case that $f_2 > 0$.
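
The run argument above relies on the fact that the longest run of ones in $n$ i.i.d. Bernoulli($p$) trials is of order $\log n / |\log p|$. The following Python sketch is an illustration only: the function names are hypothetical, and `typical_run_length` is just the heuristic scale, not a bound taken from [2].

```python
import math

def longest_run_of_ones(bits):
    """Length of the longest block of consecutive ones in a 0/1 sequence."""
    best = current = 0
    for b in bits:
        current = current + 1 if b == 1 else 0
        best = max(best, current)
    return best

def typical_run_length(n, p):
    """Heuristic scale log(n) / |log(p)| for the longest run of ones
    in n independent Bernoulli(p) trials (0 < p < 1)."""
    return math.log(n) / abs(math.log(p))
```

In the setting of the proof one would take $n$ of order $N f_2/2$ and $p = f_2/(4\mu)$, so that runs of length $2\alpha\log N$ appear whp once $\alpha$ is small enough.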

We now complete the proof of Theorem 1.5 when $f_2 = 0$ by adapting the above argument.
When $f_2 = 0$, and since $f_1 + f_2 > 0$, we must have that $f_1 > 0$. Let $k^* > 2$ be the smallest integer
such that $f_{k^*} > 0$. This $k^*$ must exist, since $f_1 < 1$. Denote by $N^*(2)$ the total number of nodes of
degree $k^*$ whose first $k^* - 2$ stubs are connected to nodes with degree 1. Thus, effectively,
after the first $k^* - 2$ stubs have been connected to nodes with degree 1, we are left with a structure
which has 2 free stubs. These nodes will replace the $N(2)$ nodes used in the above proof. It is not
hard to see that whp $N^*(2) \ge f^*_2 N/2$ for some $f^*_2 > 0$. Then, the argument for $f_2 > 0$ can be
repeated, replacing $N(2)$ by $N^*(2)$ and $f_2$ by $f^*_2$. In more detail, for any $1 \le k \le \lfloor N^*(2)/(2k^*)\rfloor$,
the $P_N$-probability that $\chi_{i^*_k} = 1$ is at least

$$\frac{2N^*(2) - C^*_N(k)}{L_N - C^*_N(k)}, \tag{4.22}$$

where $C^*_N(k)$ is the total number of paired stubs after $k + 1$ pairings of the free stubs incident to
the $N^*(2)$ nodes. By definition of $C^*_N(k)$, for any $k \le N^*(2)/(2k^*)$, we have

$$C^*_N(k) = 2k^*(k - 1) + 1 \le N^*(2). \tag{4.23}$$

Substitution of (4.23), $N^*(2) \ge f^*_2 N/2$ and the bound on $L_N$ in (4.21) into (4.22) gives us that the
right side of (4.22) is at least

$$\frac{N^*(2)}{L_N} \ge \frac{f^*_2}{4\mu}.$$

Now the proof can be completed as above. We omit further details. □

4.3 A log log upper bound on the diameter for $\tau \in (2,3)$

In this section, we investigate the diameter of the configuration model when $f_1 + f_2 = 0$, or
equivalently $\mathbb{P}(D \ge 3) = 1$. We assume (1.13) for some $\tau \in (2,3)$. We will show that under
these assumptions $C_F \log\log N$ is an upper bound on the diameter of $G$ for some sufficiently large
constant $C_F$ (see Theorem 1.6).

The proof is divided into two key steps. In the first, in Proposition 4.2, we give a bound on the
diameter of the core of the configuration model consisting of all nodes with degree at least a certain
power of $\log N$. This argument is very close in spirit to the one in [25], the only difference being
that we have simplified the argument slightly. After this, in Proposition 4.5, we derive a bound
on the distance between nodes with small degree and the core. We note that Proposition 4.2 only
relies on the assumption in (1.13), while Proposition 4.5 only relies on the fact that $\mathbb{P}(D \ge 3) = 1$.
We start by investigating the core of the configuration model.

We take $\sigma > 0$ and define the core $\mathrm{Core}_N$ of the configuration model to be

$$\mathrm{Core}_N = \{i : D_i \ge (\log N)^{\sigma}\}, \tag{4.24}$$

i.e., the set of nodes with degree at least $(\log N)^{\sigma}$. Then, the diameter of the core is bounded in
the following proposition:

Proposition 4.2 (The diameter of the core) For every $\sigma > \frac{1}{3-\tau}$, the diameter of $\mathrm{Core}_N$ is
bounded above by

$$\frac{2\log\log N}{|\log(\tau - 2)|}\,(1 + o(1)). \tag{4.25}$$

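Proof. We note that (1.13) implies that whp the largest degree $D_{(N)}$ satisfies

$$D_{(N)} \ge u_1, \qquad \text{where } u_1 = (\log N)^{-1} N^{\frac{1}{\tau-1}}, \tag{4.26}$$

because for $N \to \infty$,

$$\mathbb{P}(D_{(N)} > u_1) = 1 - \mathbb{P}(D_{(N)} \le u_1) = 1 - [F(u_1)]^N \ge 1 - (1 - c u_1^{1-\tau})^N = 1 - \Big(1 - \frac{c(\log N)^{\tau-1}}{N}\Big)^N \sim 1 - \exp\big(-c(\log N)^{\tau-1}\big) \to 1. \tag{4.27}$$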



Define

$$\mathcal{N}^{(1)} = \{i : D_i \ge u_1\}, \tag{4.28}$$

so that, whp, $\mathcal{N}^{(1)} \neq \emptyset$. For some constant $C > 0$, which will be specified later, and $k \ge 2$ we
define recursively

$$u_k = C \log N\, (u_{k-1})^{\tau-2}. \tag{4.29}$$

Then, we define

$$\mathcal{N}^{(k)} = \{i : D_i \ge u_k\}. \tag{4.30}$$

We start by identifying $u_k$.

Lemma 4.3 (Identification of $u_k$) For each $k \in \mathbb{N}$,

$$u_k = C^{a_k} (\log N)^{b_k} N^{c_k}, \tag{4.31}$$

with

$$c_k = \frac{(\tau-2)^{k-1}}{\tau-1}, \qquad b_k = \frac{1}{3-\tau} - \frac{4-\tau}{3-\tau}(\tau-2)^{k-1}, \qquad a_k = \frac{1 - (\tau-2)^{k-1}}{3-\tau}. \tag{4.32}$$

Proof. We will identify $a_k$, $b_k$ and $c_k$ recursively. We note that $c_1 = \frac{1}{\tau-1}$, $b_1 = -1$, $a_1 = 0$. By
(4.29), we can, for $k \ge 2$, relate $a_k, b_k, c_k$ to $a_{k-1}, b_{k-1}, c_{k-1}$ as follows:

$$c_k = (\tau - 2)c_{k-1}, \qquad b_k = 1 + (\tau - 2)b_{k-1}, \qquad a_k = 1 + (\tau - 2)a_{k-1}. \tag{4.33}$$

As a result, we obtain

$$c_k = (\tau - 2)^{k-1} c_1 = \frac{(\tau-2)^{k-1}}{\tau - 1}, \tag{4.34}$$

$$b_k = b_1(\tau-2)^{k-1} + \sum_{i=0}^{k-2} (\tau-2)^i = \frac{1 - (\tau-2)^{k-1}}{3-\tau} - (\tau-2)^{k-1}, \tag{4.35}$$

$$a_k = \frac{1 - (\tau-2)^{k-1}}{3 - \tau}. \tag{4.36}$$

□
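
The closed forms in Lemma 4.3 can be checked mechanically against the recursion (4.33). The sketch below uses exact rational arithmetic; the function names are hypothetical, and $\tau$ must be passed as a `fractions.Fraction` with $2 < \tau < 3$.

```python
from fractions import Fraction

def exponents_by_recursion(tau, k):
    """Iterate (4.33) from a_1 = 0, b_1 = -1, c_1 = 1/(tau - 1)."""
    a, b, c = Fraction(0), Fraction(-1), 1 / (tau - 1)
    for _ in range(k - 1):
        a, b, c = 1 + (tau - 2) * a, 1 + (tau - 2) * b, (tau - 2) * c
    return a, b, c

def exponents_closed_form(tau, k):
    """Closed forms for (a_k, b_k, c_k) as stated in (4.32)."""
    r = (tau - 2) ** (k - 1)
    a = (1 - r) / (3 - tau)
    b = 1 / (3 - tau) - (4 - tau) / (3 - tau) * r
    c = r / (tau - 1)
    return a, b, c
```

For example, with `tau = Fraction(5, 2)` both functions return the same triple for every $k \ge 1$, and $b_k$ increases towards $1/(3-\tau) = 2$ as $k$ grows.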



The key step in the proof of Proposition 4.2 is the following lemma:



Lemma 4.4 (Connectivity between $\mathcal{N}^{(k-1)}$ and $\mathcal{N}^{(k)}$) Fix $k \ge 2$, and $C > 4\mu/c$ (see (1.6)
and (1.13), respectively). Then, the probability that there exists an $i \in \mathcal{N}^{(k)}$ that is not directly
connected to $\mathcal{N}^{(k-1)}$ is $o(N^{-\delta})$, for some $\delta > 0$ independent of $k$.

Proof. We note that, by definition,

$$\sum_{i \in \mathcal{N}^{(k-1)}} D_i \ge u_{k-1}\,|\mathcal{N}^{(k-1)}|. \tag{4.37}$$

Also,

$$|\mathcal{N}^{(k-1)}| \sim \mathrm{Bin}\big(N, 1 - F(u_{k-1})\big), \tag{4.38}$$

and we have that, by (1.13),

$$N[1 - F(u_{k-1})] \ge cN(u_{k-1})^{1-\tau}, \tag{4.39}$$

which, by Lemma 4.3, grows as a positive power of $N$, since $c_k \le c_2 = \frac{\tau-2}{\tau-1} < \frac{1}{\tau-1}$. Using (4.11), we
obtain that the probability that $|\mathcal{N}^{(k-1)}|$ is at most $N[1 - F(u_{k-1})]/2$ is exponentially
small in $N$. As a result, we obtain that for every $k$, and whp

$$\sum_{i \in \mathcal{N}^{(k)}} D_i \ge \frac{c}{2}\, N (u_k)^{2-\tau}. \tag{4.40}$$

We note (see e.g., [20, (4.34)]) that for any two sets of nodes $A$, $B$, we have that

$$P_N(A \text{ not directly connected to } B) \le e^{-\frac{D_A D_B}{L_N}}, \tag{4.41}$$

where, for any $A \subseteq \{1, \ldots, N\}$, we write

$$D_A = \sum_{i \in A} D_i. \tag{4.42}$$

On the event where $|\mathcal{N}^{(k-1)}| \ge N[1 - F(u_{k-1})]/2$ and where $L_N \le 2\mu N$, we then obtain by (4.41)
and Boole's inequality that the $P_N$-probability that there exists an $i \in \mathcal{N}^{(k)}$ such that $i$ is not
directly connected to $\mathcal{N}^{(k-1)}$ is bounded by

$$N \exp\Big(-\frac{u_k N u_{k-1}[1 - F(u_{k-1})]}{2L_N}\Big) \le N \exp\Big(-\frac{c\, u_k (u_{k-1})^{2-\tau}}{4\mu}\Big) = N^{1 - \frac{cC}{4\mu}}, \tag{4.43}$$

where we have used (4.29). Taking $C > 4\mu/c$ proves the claim. □

We now complete the proof of Proposition 4.2. Fix

$$k^* = \Big\lfloor \frac{2\log\log N}{|\log(\tau - 2)|} \Big\rfloor. \tag{4.44}$$

As a result of Lemma 4.4, we have whp that the diameter of $\mathcal{N}^{(k^*)}$ is at most $2k^*$, because the
distance between any node in $\mathcal{N}^{(k^*)}$ and the node with degree $D_{(N)}$ is at most $k^*$. Therefore, we
are done when we can show that

$$\mathrm{Core}_N \subseteq \mathcal{N}^{(k^*)}. \tag{4.45}$$

For this, we note that

$$\mathcal{N}^{(k^*)} = \{i : D_i \ge u_{k^*}\}, \tag{4.46}$$

so that it suffices to prove that $u_{k^*} \le (\log N)^{\sigma}$, for any $\sigma > \frac{1}{3-\tau}$. According to Lemma 4.3,

$$u_{k^*} = C^{a_{k^*}} (\log N)^{b_{k^*}} N^{c_{k^*}}. \tag{4.47}$$

Because for $x \to \infty$, and $2 < \tau < 3$,

$$x(\tau - 2)^{\frac{2\log x}{|\log(\tau-2)|}} = x \cdot x^{-2} = o(\log x), \tag{4.48}$$

we find with $x = \log N$ that

$$\log N \cdot (\tau - 2)^{\frac{2\log\log N}{|\log(\tau-2)|}} = o(\log\log N), \tag{4.49}$$

implying: $N^{c_{k^*}} = (\log N)^{o(1)}$, $(\log N)^{b_{k^*}} = (\log N)^{\frac{1}{3-\tau} + o(1)}$, and $C^{a_{k^*}} = (\log N)^{o(1)}$. Thus,

$$u_{k^*} = (\log N)^{\frac{1}{3-\tau} + o(1)}, \tag{4.50}$$

so that, by picking $N$ sufficiently large, we can make $\frac{1}{3-\tau} + o(1) \le \sigma$. This completes the proof of
Proposition 4.2. □
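
A quick numerical illustration of the estimate (4.49): after roughly $2\log\log N / |\log(\tau-2)|$ iterations, the exponent $c_k$ of $N$ is so small that $N^{c_k}$ is essentially constant. The sketch below is an assumption-laden illustration: the function names are hypothetical, and taking the integer part (floor) of $2\log\log N/|\log(\tau-2)|$ for $k^*$ is an assumption of this sketch.

```python
import math

def k_star(N, tau):
    """Integer part of 2 log log N / |log(tau - 2)| (floor is assumed here)."""
    return math.floor(2 * math.log(math.log(N)) / abs(math.log(tau - 2)))

def log_of_N_factor(N, tau):
    """Return c_{k*} * log N, i.e. the logarithm of the factor N^{c_{k*}},
    with c_k = (tau - 2)^(k - 1) / (tau - 1) as in Lemma 4.3."""
    k = k_star(N, tau)
    return (tau - 2) ** (k - 1) / (tau - 1) * math.log(N)
```

For instance, with $N = 10^{100}$ and $\tau = 2.5$ one gets $k^* = 15$ and $c_{k^*}\log N$ below $0.01$, so the factor $N^{c_{k^*}}$ is close to 1 while $k^*$ is still tiny compared with $\log N$.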



Define

$$C(m, \varepsilon) = \Big(\frac{\tau-2}{3-\tau} + 1 + \varepsilon\Big) \Big/ \log m, \tag{4.51}$$

where $\varepsilon > 0$ and $m \ge 2$ is an integer.

Proposition 4.5 (The maximal distance between the periphery and the core) Assume that
$\mathbb{P}(D \ge m + 1) = 1$ for some $m \ge 2$, and take $\varepsilon > 0$. Then, whp, the maximal distance between
any node and the core is bounded from above by $C(m, \varepsilon)\log\log N$.

Proof. We start from a node $i$ and will show that the probability that the distance between $i$
and $\mathrm{Core}_N$ is at least $C(m, \varepsilon)\log\log N$ is $o(N^{-1})$. This proves the claim. For this, we explore
the neighborhood of $i$ as follows. From $i$, we connect the first $m + 1$ stubs (ignoring the other
ones). Then, successively, we connect the first $m$ stubs from the closest node to $i$ that we have
connected to and have not yet been explored. We call the arising process when we have explored
up to distance $k$ from the initial node $i$ the $k$-exploration tree.

When we never connect two stubs between nodes we have connected to, then the number of
nodes we can reach in $k$ steps is precisely equal to $(m + 1)m^{k-1}$. We call an event where a stub on
the $k$-exploration tree connects to a stub incident to a node in the $k$-exploration tree a collision.
The number of collisions in the $k$-exploration tree is the number of cycles or self-loops in it. When $k$
increases, the probability of a collision increases. However, for $k$ of order $\log\log N$, the probability
that at least two collisions occur in the $k$-exploration tree is small, as we will prove now:

Lemma 4.6 (Not more than one collision) Take $k = \lceil C(m, \varepsilon)\log\log N \rceil$. Then, the
$P_N$-probability that there exists a node of which the $k$-exploration tree has at least two collisions,
before hitting the core $\mathrm{Core}_N$, is bounded by $(\log N)^d L_N^{-2}$, for $d = 4C(m, \varepsilon)\log(m + 1) + 2\sigma$.

Proof. For any stub in the $k$-exploration tree, the probability that it will create a collision before
hitting the core is bounded above by $(m + 1)m^{k-1}(\log N)^{\sigma} L_N^{-1}$. The probability that two stubs will
both create a collision is, by similar arguments, bounded above by $\big[(m + 1)m^{k-1}(\log N)^{\sigma} L_N^{-1}\big]^2$.
The total number of possible pairs of stubs in the $k$-exploration tree is bounded by

$$[(m + 1)(1 + m + \ldots + m^{k-1})]^2 \le [(m + 1)m^k]^2,$$

so that by Boole's inequality, the probability that the $k$-exploration tree has at least two collisions
is bounded by

$$[(m + 1)m^k]^4 (\log N)^{2\sigma} L_N^{-2}. \tag{4.52}$$

When $k = \lceil C(m, \varepsilon)\log\log N \rceil$, we have that $[(m + 1)m^k]^4 (\log N)^{2\sigma} \le (\log N)^d$, where
$d = 4C(m, \varepsilon)\log(m + 1) + 2\sigma$. □
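
The counting used in the proof above is elementary and can be sketched as follows. The function names are hypothetical; the sketch only reproduces the layer size $(m+1)m^{k-1}$ of a collision-free exploration tree and the bound $(m+1)(1 + m + \ldots + m^{k-1}) \le (m+1)m^k$ on the number of stubs explored.

```python
def layer_size(m, k):
    """Number of nodes at distance exactly k >= 1 from the root when the
    root uses m + 1 stubs, every other node uses m, and no collision occurs."""
    return (m + 1) * m ** (k - 1)

def stubs_explored(m, k):
    """(m + 1)(1 + m + ... + m^(k-1)): stubs attached within the first k layers."""
    return (m + 1) * sum(m ** j for j in range(k))
```

For $m = 2$ and $k = 3$ this gives a third layer of 12 nodes and 21 explored stubs, below the bound $(m+1)m^k = 24$ used in the proof.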

Lemma 4.6 is interesting in its own right. For example, we will now use it together with Theorem
1.4(i) to prove Theorem 1.4(ii):

Proof of Theorem 1.4(ii). By Lemma 4.6, whp there is at most one collision in the $k$-exploration tree
from any vertex $i \in \{1, \ldots, N\}$ before hitting the core. As a result, for any $i$, we have that the
$k$-exploration tree contains at least $\min\{(m - 1)m^k, (\log N)^{\sigma}\}$ stubs. When $k = C(m, \varepsilon)\log\log N$,
we have that $(m - 1)m^k \gg \log N$, so that the $k$-exploration tree contains at least $K\log N$ stubs
for some large enough $K > 0$. By Proposition 3.3, the connected component of $i$ has whp at
least $\varepsilon L_N$ edges, and, in turn, by Lemma 3.4, at least $\eta N$ nodes. By Theorem 1.4(i) (which has
already been proved in Section 2 and which applies, since $\mu \ge 3 > 2$ when
$\mathbb{P}(D \ge 3) = 1$), we have that the size of the complement of the largest connected component is
bounded whp. Therefore, we must have that $i$ is part of the giant component. Since this is true
for every $i \in \{1, \ldots, N\}$, we obtain that the giant component must have size $N$, so that the random
graph is connected. □

Finally, we show that for $k = C(m, \varepsilon)\log\log N$, the $k$-exploration tree will, whp, connect to
$\mathrm{Core}_N$:

Lemma 4.7 (Connecting the exploration tree to the core) Take $k = C(m, \varepsilon)\log\log N$. Then,
the probability that there exists an $i$ such that the distance of $i$ to the core is at least $k$ is $o(N^{-1})$.

Proof. Since $\mu < \infty$ we have that $L_N/N \sim \mu$. Then, by Lemma 4.6, the probability that
there exists a node for which the $k$-exploration tree has at least 2 collisions before hitting the
core is $o(N^{-1})$. When the $k$-exploration tree from a node $i$ does not have two collisions, then
there are at least $(m - 1)m^{k-1}$ stubs in the $k$th layer that have not yet been connected. When
$k = C(m, \varepsilon)\log\log N$ this number is at least equal to $(\log N)^{C(m,\varepsilon)\log m + o(1)}$. Furthermore, the
number of stubs incident to the core $\mathrm{Core}_N$ is stochastically bounded from below by $(\log N)^{\sigma}$ times
a binomial distribution with parameters $N$ and success probability $\mathbb{P}(D_1 \ge (\log N)^{\sigma})$. The expected
number of stubs incident to $\mathrm{Core}_N$ is therefore at least $N(\log N)^{\sigma}\,\mathbb{P}(D_1 \ge (\log N)^{\sigma})$, so that whp
the number of stubs incident to $\mathrm{Core}_N$ is at least (by (4.11))

$$\frac{1}{2} N (\log N)^{\sigma}\, \mathbb{P}\big(D_1 \ge (\log N)^{\sigma}\big) \ge \frac{c}{2} N (\log N)^{\frac{2-\tau}{3-\tau}}. \tag{4.53}$$

By (4.41), the probability that we connect none of the stubs of the $k$-exploration tree to one of the
stubs incident to $\mathrm{Core}_N$ is bounded by

$$\exp\Big\{-\frac{cN(\log N)^{\frac{2-\tau}{3-\tau} + C(m,\varepsilon)\log m}}{2L_N}\Big\} \le \exp\Big\{-\frac{c}{4\mu}(\log N)^{\frac{2-\tau}{3-\tau} + C(m,\varepsilon)\log m}\Big\} = o(N^{-1}), \tag{4.54}$$

because whp $L_N/N \le 2\mu$, and since $\frac{2-\tau}{3-\tau} + C(m, \varepsilon)\log m = 1 + \varepsilon$. □

Propositions 4.2 and 4.5 prove that whp the diameter of the configuration model is bounded
above by $C_F \log\log N$, where

$$C_F = \frac{2}{|\log(\tau - 2)|} + \frac{2\big(\frac{\tau-2}{3-\tau} + 1 + \varepsilon\big)}{\log m}. \tag{4.55}$$

This completes the proof of Theorem 1.6. □
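
To get a feeling for the size of the constant, the sketch below evaluates $C_F$ numerically. It assumes the reading $C(m,\varepsilon) = \big(\frac{\tau-2}{3-\tau} + 1 + \varepsilon\big)/\log m$ of (4.51) and $C_F = 2/|\log(\tau-2)| + 2C(m,\varepsilon)$ of (4.55); the function names are hypothetical.

```python
import math

def c_periphery(tau, m, eps):
    """C(m, eps) under the assumed reading of (4.51); needs 2 < tau < 3, m >= 2."""
    return ((tau - 2) / (3 - tau) + 1 + eps) / math.log(m)

def c_f(tau, m, eps):
    """C_F under the assumed reading of (4.55): a core term plus two
    periphery-to-core paths, one for each endpoint."""
    return 2 / abs(math.log(tau - 2)) + 2 * c_periphery(tau, m, eps)
```

For $\tau = 2.5$, $m = 2$ and $\varepsilon = 0.1$ this gives $C_F = 6.2/\log 2$, roughly $8.94$. The constant grows without bound as $\tau \uparrow 3$, through both the core term and the periphery term.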